Topic/Partition allocation for pushing future graph estimations to users.

Hello everyone, I am new to Kafka and would like some advice on picking the number of topics and partitions for an upcoming project I am working on for educational purposes.

The main goal is for users to log into a personalized feed page showing graph estimations for various datasets, updated in real time. Each user can subscribe to different datasets.

Currently the architecture looks a bit like this:

  1. We have a producer that sends the dataset estimations to Kafka.
  2. We have a websocket service that clients connect to in order to subscribe to the dataset estimations they need.
  3. The websocket service is also a Kafka consumer: it always gets the latest dataset estimation from Kafka and sends it to the subscribed clients.

The clients should always be able to see the latest available estimation and any subsequent updates to it, so the service must keep track of the latest estimation per dataset and send it when a client connects.
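The "latest value per dataset" requirement can be sketched without any Kafka dependency. This is a minimal illustration only: the class name `LatestEstimations` and its two callbacks are hypothetical, and a real service would call them from its Kafka consumer loop and its websocket handlers.

```python
class LatestEstimations:
    """Hypothetical in-memory state for the websocket service."""

    def __init__(self):
        self._latest = {}        # dataset_id -> newest estimation seen so far
        self._subscribers = {}   # dataset_id -> set of client ids

    def on_kafka_message(self, dataset_id, estimation):
        # Overwrite the cached value and report which clients need the update.
        self._latest[dataset_id] = estimation
        clients = sorted(self._subscribers.get(dataset_id, ()))
        return [(client, estimation) for client in clients]

    def on_client_subscribe(self, client, dataset_id):
        # Register the client and hand back the current value (None if unseen).
        self._subscribers.setdefault(dataset_id, set()).add(client)
        return self._latest.get(dataset_id)
```

A client that subscribes late still receives the newest cached estimation immediately, and later messages are fanned out to every subscriber of that dataset.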

What I am having some trouble with is deciding on the topic/partition scheme for the estimation messages. Since we want consumers to always get the latest estimation, the messages should probably be keyed by the dataset identifier so that estimations for the same dataset do not arrive out of order.
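Keying helps because Kafka only guarantees ordering within a partition, and a keyed message is routed to a partition derived from a hash of its key. The sketch below illustrates that idea with a stand-in hash (CRC32). That choice is an assumption for illustration: Kafka's Java client uses murmur2 on the key bytes by default, so the actual partition numbers will differ.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    # Stand-in hash for illustration; not the hash Kafka itself uses.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# Every message for the same dataset lands in the same partition,
# so a consumer sees that dataset's estimations in produced order.
datasets = ["temperature", "rainfall", "traffic"]
assignment = {name: partition_for(name, 6) for name in datasets}
```

Note that the mapping depends on `num_partitions`, so changing the partition count later can move a key to a different partition.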

Other than that, I am not sure if I should make multiple topics (one for each dataset) or a single topic with few or many partitions.

I am currently leaning toward a single topic, since with the key ordering restriction I would probably end up making multiple single-partition topics if I went with the multiple-topic scheme.

For the single topic scheme I am not sure how many partitions I should go for. Since currently the only consumer is the websocket service, throughput is limited by the number of clients the service can broadcast to.

If we want to scale this, the most natural next step would be to scale the websocket service by running more instances, each belonging to a separate consumer group (perhaps behind a websocket-aware load balancer).

Having multiple consumers per consumer group is probably not very useful in this case, since scaling with the number of clients by adding more consumer groups would probably be more effective.

Going with multiple partitions, I guess I would manually map each dataset to a partition directly. This may become an operational issue when adding more datasets in the future.

All in all I am not sure if I should go with a single or multiple partitions on the single topic, I suspect the biggest factor in the d


👍 6
👀 u/DemonInAJar
📅 Jan 08 2022
[R] Partition and Code: learning how to compress graphs arxiv.org/abs/2107.01952
👍 16
👀 u/hardmaru
📅 Nov 10 2021
[R] Partition and Code: learning how to compress graphs arxiv.org/abs/2107.01952
👍 2
👀 u/research_mlbot
📅 Nov 10 2021
[OC] Interactive Observable Notebook for exploring US States Covid-19 data. Top-K or manually select states to graph. Also includes several ways to partition the states into groups and aggregate the stats for the group, as shown below. Full, live, source code included at link. Updates daily.
👍 4
👀 u/davidsmaynard
📅 Jul 16 2020
Algorithms for Counting Partitions of Graphs

My apologies if this is a basic question in combinatorics, but I'm a PDE theorist who just so happens to be thinking about this problem.

Suppose I have a graph G=(V,E) with n vertices and let m divide n. I want to create a disjoint partition of V into m sets (V_1, V_2, ... , V_m), each of size n/m, subject to the constraint that if v_i and v_j are both in V_k, then (v_i,v_j) is in E.

I'd like to be able to count the number of possible disjoint partitions subject to this constraint. Are there results of this type known? If so, could anyone point me to a reference?
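For small graphs the count can be checked by brute force. The sketch below is only an illustration of the constraint (every block must be a clique of size n/m) and is not a known efficient algorithm. The function name is hypothetical, and it counts unordered partitions, so multiply by m! if the blocks V_1, ..., V_m are considered labelled. For blocks of size 2 this is the problem of counting perfect matchings, which is #P-complete in general, so an efficient exact method for arbitrary graphs should not be expected.

```python
from itertools import combinations


def count_clique_partitions(vertices, edges, m):
    """Count unordered partitions of `vertices` into m equal-size cliques."""
    n = len(vertices)
    if m <= 0 or n % m:
        return 0
    size = n // m
    edge_set = {frozenset(e) for e in edges}

    def is_clique(group):
        return all(frozenset(pair) in edge_set for pair in combinations(group, 2))

    def count(remaining):
        if not remaining:
            return 1
        # Always place the first remaining vertex, so each partition is counted once.
        first, rest = remaining[0], remaining[1:]
        total = 0
        for others in combinations(rest, size - 1):
            if is_clique((first,) + others):
                total += count(tuple(v for v in rest if v not in others))
        return total

    return count(tuple(vertices))


k4 = list(combinations(range(4), 2))          # complete graph on 4 vertices
c4 = [(0, 1), (1, 2), (2, 3), (3, 0)]         # 4-cycle
```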

👍 3
👀 u/qamorris
📅 Jul 25 2019
Graph Theory and Integer Partitions.

I'm looking for a graph that would allow us to choose positive natural weights for all edges so that:

(1) For some N and for each vertex v, the sum of the weights of all edges incident to v is equal to N.

(2) For each vertex v, the weights of all edges incident to v form an [integer partition](https://en.wikipedia.org/wiki/Partition_(number_theory)) of N that is not shared by another vertex.

Here is an illustration.

In the illustration N=8. For every vertex, we have the same partition 1+2+5, so requirement (2) isn't satisfied.

I know that complete graphs can't satisfy (1) and (2). I'm starting to think that no graphs can, but I don't have a proof.
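To experiment with candidate graphs, both requirements can be checked mechanically. The sketch below is a hypothetical helper, and the example weighting (the complete graph on 4 vertices with weights 1, 2 and 5) is my own stand-in consistent with the description, not the illustration from the post.

```python
from collections import defaultdict


def check_weighting(weighted_edges):
    """Return (requirement 1 holds, requirement 2 holds) for (u, v, weight) triples."""
    incident = defaultdict(list)
    for u, v, w in weighted_edges:
        incident[u].append(w)
        incident[v].append(w)
    # (1) every vertex has the same sum N of incident weights
    same_sum = len({sum(ws) for ws in incident.values()}) == 1
    # (2) the multiset of incident weights differs from vertex to vertex
    partitions = [tuple(sorted(ws)) for ws in incident.values()]
    all_distinct = len(set(partitions)) == len(partitions)
    return same_sum, all_distinct


# Every vertex sees the weights 1, 2 and 5, so N = 8 but (2) fails.
example = [(0, 1, 1), (2, 3, 1), (0, 2, 2), (1, 3, 2), (0, 3, 5), (1, 2, 5)]
```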

What do you think?

👍 6
👀 u/128pages
📅 Dec 23 2016
Part II of my multi-series Reddit Visualizations. This is a partition graph of Reddit comments. [OC] dev.redditanalytics.com/w…
👍 31
📅 Jun 25 2013
Haskell for Maths: Faster graph symmetries using distance partitions haskellformaths.blogspot.…
👍 4
👀 u/dons
📅 Jul 20 2009
Graph partitioning using Metis

Hi, I'm looking at ways of partitioning graphs, and I found gpmetis online for this. I was able to successfully partition a graph as per my requirement (edgecut = 2). However, I want to know the vertices where the edge cut was made by Metis. How do I figure that out? Should I use any options to view the vertices involved in the cut? If not Metis, are there any alternatives? Thanks
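One way to recover the cut is to post-process the partition vector rather than look for a tool option. The sketch below assumes the vector has already been loaded into a list `part`, where `part[v]` is the part id of vertex `v` (to my knowledge gpmetis writes one part id per line, in vertex order, to its partition output file, but treat that as an assumption). It also assumes 0-based vertex ids, whereas METIS graph files number vertices from 1. The function name is hypothetical.

```python
def cut_edges(edges, part):
    """Return the edges crossing between parts and the vertices they touch."""
    cut = [(u, v) for u, v in edges if part[u] != part[v]]
    boundary = sorted({x for edge in cut for x in edge})
    return cut, boundary


# Two triangles joined by two edges; the edgecut of this bisection is 2.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3), (1, 4)]
part = [0, 0, 0, 1, 1, 1]
cut, boundary = cut_edges(edges, part)
```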

👍 2
👀 u/Zeit-reisender
📅 Sep 28 2021
[D] Distributed Graph Partitioning Algorithms

Does anyone here work with and keep up to date on the current research on graph partitioning methods? I'm looking for a way to partition a graph and minimize edges between partitions. The most common algorithm seems to be METIS. It's implemented in a lot of different libraries like DGL, but it seems to be bound to one machine.

I'm looking to partition up to a trillion edges, so I'm seeing if there's consensus on a distributed algorithm / framework for the graph partitioning problem. If not, I'll try running METIS on an x1e.32xlarge host with 4 TB of RAM and see if that's enough.

The end goal is to build large knowledge graphs for building graph neural networks.

👍 8
👀 u/wagthesam
📅 Aug 26 2021
scCorr: A graph-based k-partitioning approach for single-cell gene-gene correlation analysis [bioRxiv] reddit.com/r/BiologyPrepr…
👍 3
👀 u/Ginkgopsida
📅 Mar 16 2021
Compression utility using graph partitioning

I wrote this tiny (~350 lines of code) lossless compression algorithm based on partitioning a graph into clusters: https://github.com/rand3289/GraphCompress

I tried using it on data from /dev/urandom and of course the graph metadata exceeds compression... I have not tried it on other types of data yet.

The algorithm is very simple: As a file is read, it represents a path through a graph. Later I partition the graph into clusters and optimize to have the least number of hyper edges (edges between clusters). This way internal vertexes can be represented as cluster indexes instead of a global vertex ID (go from 16 bit to 8 bit). This action creates lots of bytes with 0 and the file can now be compressed with any compression utility.

I do not know much about compression and was wondering if this is an existing technique?

👍 7
👀 u/rand3289
📅 Nov 11 2020
Graph Partitioning but minimize matrix vector product error?

Graph Partitioning is usually done in such a way to minimize the difference in the edge weights.

If the adjacency matrix for the graph is A then I want to find a partitioned graph A' that minimizes the error on average between (Ax - A'x). Is that the same thing as partitioning based on minimizing the edge cuts? It seems like minimizing the edge cuts is minimizing the L1 norm of (A-A'), but minimizing the squared error is minimizing the L2 norm of (A-A'). Does spectral partitioning do both?
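One way to relate the two objectives is to write the error in terms of D = A - A'. The derivation below is a sketch under stated assumptions: x is random with zero mean and identity covariance, A' is obtained from A by deleting the cut edges, and "L2 norm" is read as the Frobenius norm. None of these choices comes from the post.

```latex
\begin{aligned}
D &= A - A', \qquad \mathbb{E}[x] = 0, \quad \mathbb{E}[x x^{\top}] = I \\
\mathbb{E}\,\lVert Dx \rVert_2^2
  &= \mathbb{E}\,\operatorname{tr}\!\left(D\, x x^{\top} D^{\top}\right)
   = \operatorname{tr}\!\left(D D^{\top}\right)
   = \lVert D \rVert_F^2
   = 2 \sum_{(i,j)\,\in\,\text{cut}} w_{ij}^2 \\
\sum_{i,j} \lvert D_{ij} \rvert
  &= 2 \sum_{(i,j)\,\in\,\text{cut}} \lvert w_{ij} \rvert
\end{aligned}
```

Under these assumptions the two objectives coincide for unweighted graphs (every w_ij is 1, so both equal twice the number of cut edges), while for weighted graphs the squared-error objective penalizes heavy cut edges more strongly than the plain cut weight does.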

👍 5
👀 u/Steve132
📅 Nov 13 2020
Graph Partitioning problem help

Let's say that I have a group of 100 people. Let's label them 1, 2, 3, ..., 100. People talk to some people and don't talk to others. For example, 1 talks with 5, 18, 19, 54, 89 and does not talk with the others. How do I split this group of 100 people into two parts A and B of 50 people each such that cross talk between group A and group B is minimized? (Cross talk is when, say, 2 talks to 8 and they belong to different groups.)

I have been pondering about this problem for a long time. Please enlighten me with some concepts that might help in solving this. Comment for any clarification.
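This is usually called minimum bisection (balanced graph partitioning), which is NP-hard in general, so for 100 people a heuristic such as Kernighan-Lin is the practical route (NetworkX, for example, ships `kernighan_lin_bisection` in `networkx.algorithms.community`). The brute-force sketch below is only meant to make the objective concrete on a tiny, made-up group. The function name and the data are hypothetical.

```python
from itertools import combinations


def min_bisection(people, talks):
    """Exhaustively find the equal split with the fewest cross-talk pairs."""
    people = sorted(people)
    half = len(people) // 2
    first = people[0]            # fix one person in group A to skip mirror images
    best = None
    for rest in combinations(people[1:], half - 1):
        group_a = {first, *rest}
        cross = sum(1 for u, v in talks if (u in group_a) != (v in group_a))
        if best is None or cross < best[0]:
            best = (cross, group_a)
    cross, group_a = best
    return cross, sorted(group_a), sorted(set(people) - group_a)


# Two tight groups of three with a single conversation between them.
talks = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
result = min_bisection(range(1, 7), talks)
```

The number of candidate splits grows as C(n, n/2), which is why this approach stops being usable long before n = 100.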

👍 4
👀 u/paanchiboi
📅 Feb 11 2020
The only and the ULTIMATE crypto dictionary that you will ever need

About a month ago I compiled a long crypto dictionary for people who are either new, out of the loop or just lurking around. As we are witnessing a massive influx of new crypto investors and users here, I was asked by the Redditors to post the updated version after Christmas. After somehow catching COVID a few days ago, I have a lot of free time on my hands so I decided to update the dictionary with all the suggestions people added in the previous post and a lot of new content to help out the newbies and veterans alike.

With expressions like "HODL", "FOMO", "DAO", "ATH", "LP" we can sometimes sound alien to people who aren't into cryptospace.

Are you new to crypto and want to join in the conversation? Let's go boys and girls, grab your cocoa and get cosy cause this is gonna be a long ride. (PS: please send some bandages for my fingers and carpal tunnel treatment braces for my poor wrists.)

| ABBREVIATION | EXPLANATION |
| --- | --- |
| Airdrop | It's not C17 dropping a Humvee onto you. It's a giveaway for holders of certain crypto or for founders. In short: free coins! |
| Altcoin / Alts | An altcoin is any coin that's not Bitcoin (however nowadays Eth is not quite considered an altcoin anymore). |
| AMM | Automated Market Maker. A kind of decentralized exchange platform (or DEX). |
| ATH | All time high aka the highest price of an asset. Also known as, the time when we decide to buy. |
| APY | The annual percentage yield (APY) is a method of calculating the amount of money earned on a money market account over the course of a year. To put it another way, this is a technique to track how interest accumulates over time. (1). Reminder: High APY is usually high risk project. |
| ASA | Added by u/HoleyBody : ASA - Algorand Standard Asset. ASA is to ALGO as ERC-20 is to ETH |
| ATL | All time low aka the lowest price of an asset so far. Also known as, the time when we feel like it's really time to sell. |
| Bear Market | Bear market is a declining market. |
| Bull Market | Bull market is a rising market. |
| Buy the dip | When crypto dips, it's a good time to buy. Hence, buy the dip and have some tasty nachos ready in case of drama. |
| Bubble | Explained by u/PhilTheQuant : the state of a market going up only because people are seeking to join in with a market going up, not due to any permanent or underlying increase in value. Characterised by increased personal debt and an influx of investors with little or no market knowledge. |

👍 199
👀 u/DaddySkates
📅 Jan 15 2022
Our free online course in graph theory from Stanford called 'Graph Partitioning and Expanders' starts today! venture-lab.org/expanders
👍 89
👀 u/novoed
📅 Apr 23 2013
Is there an algorithm for this already?

Hello all,

I've got a problem and I'm trying to determine whether there is already an algorithm or equation out there that will solve this problem, or if I will have to engineer one myself.

(Note: I'm not asking for anyone to write any code or do any math for me and this isn't for a class. Just looking for the name of the algorithm, if it exists).

Here is an abstracted, general description of the problem:

-You have a grid of blocks. Each block has a given number of dots in it, and the dots are color-coded in one of two colors. Find the set of block groups such that:

  1. All blocks are used
  2. No block can be used more than once
  3. You can only make groups of blocks that are next to each other (if block a <-> block b <-> block c, but a is not connected to c, a <-> b <-> c is still a valid combo.)
  4. The number of combined dots in the block group can't be over or under a certain threshold
  5. There has to be exactly a certain given number of block groups (in this case 150)
  6. The final set of block groups returned is the one where, given all of the other conditions, each group has the most equal balance between the colors of dots.

(I suspect the answer lies somewhere in graph theory, I just don't know enough of it to know whether this exists or not. My background is in data science/analytics and computational journalism.)

Thanks!

👍 7
👀 u/as9934
📅 Jan 15 2022
Graph Partitioning k-way

I was learning about NP problems and ran into k-way graph partitioning. Basically we are trying to make sets of max size n while trying to reduce the number of edges deleted from the partitioning. How would graph partitioning work if you didn't want certain nodes together in the same partition?

👍 2
👀 u/jasonlolwut
📅 Dec 02 2018
[D] Sparsest Cut in practice?

Sparsest cut, and its close variant "balanced cut", have been studied for ages in the computer science theory community, and there are algorithms to compute this which run fast. Sparsest cut is often billed as a "graph clustering" algorithm.

Sparsest cut gives rise to an obvious data clustering algorithm: build a graph on your data (like kNN or threshold graph), and run sparsest cut to get a 2-way partition. To do a k-way partition, you could recursively do this. From what I know, sparsest cut intuition is one thing that guided the creation of spectral clustering, a widely used clustering method.

Has anyone tried implementing this in practice? If so, how does it do on data? If not, why not? I have searched extensively on Google for any implementations of this sparsest-cut style clustering, but I haven't found any. Given that sparsest cut is one of the most well-studied problems in computer theory, I was surprised that no implementation of this clustering method exists.

👍 2
📅 Jan 06 2022
Algorithm for Mincut in Networkx? How many edges have to be deleted so that the graph is partitioned

I want to find the green cut in this picture and the function should return 2.

https://upload.wikimedia.org/wikipedia/commons/c/c0/Min_cut_example.svg

My graph has no flows and is undirected. The picture is from Karger's algorithm, but it seems the algorithm is randomized and I need correct values (I don't care about runtime).

In Networkx I found this https://networkx.github.io/documentation/networkx-1.10/reference/generated/networkx.algorithms.flow.minimum_cut.html but it seems that is a different mincut with flows and a start and target node, not for the whole graph.
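The quantity described here is the global minimum cut (edge connectivity) of an undirected graph, and the Stoer-Wagner algorithm computes it deterministically. NetworkX exposes this as `networkx.stoer_wagner(G)` and `networkx.edge_connectivity(G)`. The dependency-free sketch below shows the same idea for an unweighted graph. The function name and the example graph are my own, not the Wikipedia figure.

```python
from itertools import combinations


def global_min_cut(n, edges):
    """Stoer-Wagner minimum cut value for an undirected graph on vertices 0..n-1."""
    w = [[0] * n for _ in range(n)]
    for u, v in edges:
        w[u][v] += 1
        w[v][u] += 1
    vertices = list(range(n))
    best = float("inf")
    while len(vertices) > 1:
        # One phase: grow a set from vertices[0], always adding the vertex
        # most tightly connected to the set built so far.
        start = vertices[0]
        weights = {v: w[start][v] for v in vertices[1:]}
        prev = last = start
        while weights:
            nxt = max(weights, key=weights.get)
            cut_of_phase = weights.pop(nxt)
            for v in weights:
                weights[v] += w[nxt][v]
            prev, last = last, nxt
        best = min(best, cut_of_phase)
        # Merge the last-added vertex into the one added just before it.
        for v in vertices:
            if v != last and v != prev:
                w[prev][v] += w[last][v]
                w[v][prev] = w[prev][v]
        vertices.remove(last)
    return best


# Two complete graphs on 4 vertices, joined by two edges.
k4_a = list(combinations(range(4), 2))
k4_b = [(u + 4, v + 4) for u, v in k4_a]
answer = global_min_cut(8, k4_a + k4_b + [(3, 4), (2, 5)])
```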

👍 13
👀 u/G_fucking_G
📅 Feb 18 2018
[University Graph Partitioning] Matching components of fiedler vector to nodes?

I've been reading about the algebraic connectivity of graphs (aka fiedler vector/eigenvector) and how people use it to partition graphs; but none of the examples or papers show how to map components of the fiedler vector to nodes in the graph (or the edges to cut). Does anyone know of a good review paper or blog post on this?
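The mapping is positional: component i of the Fiedler vector belongs to the vertex that occupies row and column i of the Laplacian, and the usual split puts vertices with negative components on one side and the rest on the other. The cut edges are then the edges whose endpoints have different signs. The sketch below illustrates this without external libraries by running power iteration on cI - L, which is my choice for a self-contained example. In practice a routine such as `numpy.linalg.eigh` would be the more common way to get the eigenvector.

```python
def fiedler_split(n, edges, iterations=500):
    """Approximate the Fiedler vector and split vertices 0..n-1 by sign."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    c = 2 * max(len(a) for a in adj)      # upper bound on the Laplacian eigenvalues
    x = [float(i) for i in range(n)]      # arbitrary non-constant starting vector
    for _ in range(iterations):
        mean = sum(x) / n
        x = [xi - mean for xi in x]       # remove the constant eigenvector
        # y = (c*I - L) x, where (L x)[i] = deg(i)*x[i] - sum of x over neighbours
        y = [c * x[i] - (len(adj[i]) * x[i] - sum(x[j] for j in adj[i]))
             for i in range(n)]
        norm = sum(v * v for v in y) ** 0.5
        x = [v / norm for v in y]
    negative = {i for i in range(n) if x[i] < 0}
    return x, negative, set(range(n)) - negative


# Two triangles joined by the single edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
vector, side_a, side_b = fiedler_split(6, edges)
cut = [(u, v) for u, v in edges if (u in side_a) != (v in side_a)]
```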

👍 4
👀 u/CocoBashShell
📅 Apr 18 2017
Calc 2 Finding arc lengths of a graph, partitioning along x-axis

http://imgur.com/GPeLDft

"9y^2 = 4x^3 from x=0 to x=1; y> or = 0"

Not sure where I'm struggling here... I know I got the formula part right but I seem to be doing something wrong in my calculation and/or finding the antiderivative. Correct answer is (4sqrt(2) - 2)/3
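The linked working is not reproduced here, so the following is an independent derivation using the standard arc length formula. It arrives at the stated answer, which suggests any slip is in the antiderivative step rather than in the setup.

```latex
\begin{aligned}
9y^2 = 4x^3,\ y \ge 0 \;&\Rightarrow\; y = \tfrac{2}{3}x^{3/2}, \qquad \frac{dy}{dx} = x^{1/2} \\
L &= \int_0^1 \sqrt{1 + \left(\frac{dy}{dx}\right)^{2}}\,dx = \int_0^1 \sqrt{1 + x}\,dx \\
  &= \left[\tfrac{2}{3}(1 + x)^{3/2}\right]_0^1
   = \tfrac{2}{3}\left(2\sqrt{2} - 1\right)
   = \frac{4\sqrt{2} - 2}{3}
\end{aligned}
```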

👍 2
👀 u/Miokien
📅 Feb 14 2017
Scene graphs and spatial partitioning structures: What do you really need?

I've been fiddling with 2D games for awhile and I'm trying to go into 3D game development. I thought I should get my basics right first.

From what I read scene graphs hold your game objects/entities and their relation to each other like 'a tire' would be the child of 'a vehicle'. It's mainly used for frustum/occlusion culling and minimizing the collision checks between the objects.

Spatial partitioning structures on the other hand are used to divide a big game object (like the map) to smaller parts so that you can gain performance by only drawing the relevant polygons and again minimizing the collision checks to those polygons only. Also a spatial partitioning data structure can be used as a node in a scene graph.

But... I've been reading about both subjects and I've seen a lot of "scene graphs are useless" and "BSP performance gain is irrelevant with modern hardware" kind of articles.

Also some of the game engines I've checked like gameplay3d and jmonkeyengine are only using a scene graph (That also may be because they don't want to limit the developers). Whereas games like Quake and Half-Life only use spatial partitioning.

I'm aware that the usage of these structures very much depends on the type of game you're developing, so for the sake of clarity let's assume the game is an FPS like Counter-Strike with some better outdoor environment capabilities (like a terrain).

The obvious question is which one is needed and why (considering the modern hardware capabilities).

Thank you.

👍 5
👀 u/almbfsek
📅 Oct 30 2012
van layout- anyone have seats at the sliding door?

So I've been playing around on vanspace... like way too much time haha (especially since I'm not digitally savvy in that way and prefer my graph paper and pencils) anyway- I have a few different layouts for my new build and the one I am currently leaning towards is not one I've seen in other vans. I think I have a screenshot of one sort of like it in the UK but forgot to include where I found it.

I am doing a 148 high roof transit- but NOT extended, which means it is a much smaller space than so many layouts these days show. It is plenty room for me and a huge upgrade from my current regular roof van ford e350.

In all of the imaginings of what I need to be able to do in the van (including work, with a zoom background that is NOT my bed) and the other logistics, I really want to have seats with a window view and also when door is open, but I don't want to use the layout with the seats on the driver side as I feel it takes up too much space.

I am going to do a partition with an opening/ door, no swivel seats. And yes I know if I did swivel seats it would take care of this problem and give me more room, but I have my reasons.

So just wanted to see if anyone has done something like this, and do you have photos? Have you tried it and it didn't work or you did and love it?

Obviously I would make use of the box bench seats and figure out how to correctly situate either a Lagun table or something. I'm also thinking about designing a swivel back on the box at the door front so the back can move to face outside or in...

OK what do you think? (the kitchen would be on the opposite wall) (also ignore the shelves and plants I was just playing around for vibe that's not technical at all haha)

View of the passenger side sliding door (outline of door should be around that window)

👍 14
👀 u/41bluets
📅 Jan 21 2022
Algorithms: Help with partitioning a graph's vertices into two sets of equal sizes.

Hi guys, I came across an interesting problem today and I'd like some help solving it.

> Given a graph, partition its vertices into two non-empty sets of equal size such that the crossing edges have minimum total weight.

I have been reading about min-cut algorithms, especially this but I thought that asking for help or more resources here would be beneficial.

Any help regarding how I solve this would be appreciated. Thanks.

👍 2
👀 u/ashit_singh_nith
📅 Apr 06 2015
