Data Structures And Algorithms | 2019-03-02

binarytree (33)in #algorithms • 7 years ago

Data Structures And Algorithms

Fair Dimensionality Reduction and Iterative Rounding for SDPs (1902.11281v1)

Jamie Morgenstern, Samira Samadi, Mohit Singh, Uthaipon Tantipongpipat, Santosh Vempala

2019-02-28

We model "fair" dimensionality reduction as an optimization problem. A central example is the fair PCA problem: the input data is divided into groups, and the goal is to find a single -dimensional representation for all groups for which the maximum variance (or minimum reconstruction error) is optimized for all groups in a fair (or balanced) manner, e.g., by maximizing the minimum variance over the groups of the projection to a -dimensional subspace. This problem was introduced by Samadi et al. (2018) who gave a polynomial-time algorithm which, for groups, returns a -dimensional solution of value at least the best -dimensional solution. We give an exact polynomial-time algorithm for groups. The result relies on extending results of Pataki (1998) regarding rank of extreme point solutions to semi-definite programs. This approach applies more generally to any monotone concave function of the individual group objectives. For groups, our results generalize to give a -dimensional solution with objective value as good as the optimal -dimensional solution for arbitrary in polynomial time. Using our extreme point characterization result for SDPs, we give an iterative rounding framework for general SDPs which generalizes the well-known iterative rounding approach for LPs. It returns low-rank solutions with bounded violation of constraints. We obtain a -dimensional projection where the violation in the objective can be bounded additively in terms of the top -singular values of the data matrices. We also give an exact polynomial-time algorithm for any fixed number of groups and target dimension via the algorithm of Grigoriev and Pasechnik (2005). In contrast, when the number of groups is part of the input, even for target dimension , we show this problem is NP-hard.

Dynamic Planar Convex Hull (1902.11169v1)

Riko Jacob, Gerth Stølting Brodal

2019-02-28

In this article, we determine the amortized computational complexity of the planar dynamic convex hull problem by querying. We present a data structure that maintains a set of n points in the plane under the insertion and deletion of points in amortized O(log n) time per operation. The space usage of the data structure is O(n). The data structure supports extreme point queries in a given direction, tangent queries through a given point, and queries for the neighboring points on the convex hull in O(log n) time. The extreme point queries can be used to decide whether or not a given line intersects the convex hull, and the tangent queries to determine whether a given point is inside the convex hull. We give a lower bound on the amortized asymptotic time complexity that matches the performance of this data structure.

Quantum walk inspired algorithm for graph similarity and isomorphism (1902.11105v1)

Callum Schofield, Jingbo B. Wang, Yuying Li

2019-02-28

Large scale complex systems, such as social networks, electrical power grid, database structure, consumption pattern or brain connectivity, are often modeled using network graphs. Valuable insight can be gained by measuring the similarity between network graphs in order to make quantitative comparisons. Since these networks can be very large, scalability and efficiency of the algorithm are key concerns. More importantly, for graphs with unknown labeling, this graph similarity problem requires exponential time to solve using existing algorithms. In this paper, we propose a quantum walk inspired algorithm, which provides a solution to the graph similarity problem without prior knowledge on graph labeling. This algorithm is capable of distinguishing between minor structural differences, such as between strongly regular graphs with the same parameters. The algorithm has polynomial complexity, scaling with .

Ratio-Balanced Maximum Flows (1902.11047v1)

Hannaneh Akrami, Kurt Mehlhorn, Tommy Odland

2019-02-28

When a loan is approved for a person or company, the bank is subject to \emph{credit risk}; the risk that the lender defaults. To mitigate this risk, a bank will require some form of \emph{security}, which will be collected if the lender defaults. Accounts can be secured by several securities and a security can be used for several accounts. The goal is to fractionally assign the securities to the accounts so as to balance the risk. This situation can be modelled by a bipartite graph. We have a set of securities and a set of accounts. Each security has a \emph{value} and each account has an \emph{exposure} . If a security can be used to secure an account , we have an edge from to . Let be part of security 's value used to secure account . We are searching for a maximum flow that send at most units out of node and at most units into node . Then is the unsecured part of account . We are searching for the maximum flow that minimizes .

On the Area Requirements of Planar Straight-Line Orthogonal Drawings of Ternary Trees (1902.11044v1)

Barbara Covella, Fabrizio Frati, Maurizio Patrignani

2019-02-28

In this paper, we study the area requirements of planar straight-line orthogonal drawings of ternary trees. We prove that every ternary tree admits such a drawing in sub-quadratic area. Further, we present upper bounds, the outcomes of an experimental evaluation, and a conjecture on the area requirements of planar straight-line orthogonal drawings of complete ternary trees. Finally, we present a polynomial lower bound on the length of the minimum side of any planar straight-line orthogonal drawing of a complete ternary tree.

A fast algorithm for constructing balanced binary search trees (1902.02499v3)

Pavel S. Ruzankin

2019-02-07

We suggest a new non-recursive algorithm for constructing a binary search tree given an array of numbers. The algorithm has time and memory complexity if the given array of numbers is sorted. The resulting tree is of minimal height and can be transformed to a complete binary search tree (retaining minimal height) with time and memory. The algorithm allows simple and effective parallelization.

Fast Concurrent Data Sketches (1902.10995v1)

Arik Rinberg, Alexander Spiegelman, Edward Bortnikov, Eshcar Hillel, Idit Keidar, Hadar Serviansky

2019-02-28

Data sketches are approximate succinct summaries of long streams. They are widely used for processing massive amounts of data and answering statistical queries about it in real-time. Existing libraries producing sketches are very fast, but do not allow parallelism for creating sketches using multiple threads or querying them while they are being built. We present a generic approach to parallelising data sketches efficiently, while bounding the error that such parallelism introduces. Utilising relaxed semantics and the notion of strong linearisability we prove our algorithm's correctness and analyse the error it induces in two specific sketches. Our implementation achieves high scalability while keeping the error small.

Graph and String Parameters: Connections Between Pathwidth, Cutwidth and the Locality Number (1902.10983v1)

Katrin Casel, Joel D. Day, Pamela Fleischmann, Tomasz Kociumaka, Florin Manea, Markus L. Schmid

2019-02-28

We investigate the locality number, a recently introduced structural parameter for strings (with applications in pattern matching with variables), and its connection to two important graph-parameters, cutwidth and pathwidth. These connections allow us to show that computing the locality number is NP-hard but fixed parameter tractable (when the locality number or the alphabet size is treated as a parameter), and can be approximated with ratio . As a by-product, we also relate cutwidth via the locality number to pathwidth, which is of independent interest, since it improves the currently best known approximation algorithm for cutwidth. In addition to these main results, we also consider the possibility of greedy-based approximation algorithms for the locality number.

Lower Bounds for Multiplication via Network Coding (1902.10935v1)

Peyman Afshani, Casper Benjamin Freksen, Lior Kamma, Kasper Green Larsen

2019-02-28

Multiplication is one of the most fundamental computational problems, yet its true complexity remains elusive. The best known upper bound, by F"{u}rer, shows that two -bit numbers can be multiplied via a boolean circuit of size , where is the very slowly growing iterated logarithm. In this work, we prove that if a central conjecture in the area of network coding is true, then any constant degree boolean circuit for multiplication must have size , thus almost completely settling the complexity of multiplication circuits. We additionally revisit classic conjectures in circuit complexity, due to Valiant, and show that the network coding conjecture also implies one of Valiant's conjectures.

Improved algorithms for Correlation Clustering with local objectives (1902.10829v1)

Sanchit Kalhan, Konstantin Makarychev, Timothy Zhou

2019-02-27

Correlation Clustering is a powerful graph partitioning model that aims to cluster items based on the notion of similarity between items. An instance of the Correlation Clustering problem consists of a graph (not necessarily complete) whose edges are labeled by a binary classifier as "similar" and "dissimilar". Classically, we are tasked with producing a clustering that minimizes the number of disagreements: an edge is in disagreement if it is a "similar" edge and is present across clusters or if it is a "dissimilar" edge and is present within a cluster. Define the disagreements vector to be an dimensional vector indexed by the vertices, where the -th index is the number of disagreements at vertex . Recently, Puleo and Milenkovic (ICML '16) initiated the study of the Correlation Clustering framework in which the objectives were more general functions of the disagreements vector. In this paper, we study algorithms for minimizing norms of the disagreements vector for both arbitrary and complete graphs. We present the first known algorithm for minimizing the norm of the disagreements vector on arbitrary graphs and also provide an improved algorithm for minimizing the norm of the disagreements vector on complete graphs. Finally, we compliment these two algorithmic results with some hardness results.