Insane performance! - "AMD EPYC 7773X Milan-X Flagship CPU Benchmarked In Dual-Socket Configuration, Scores Almost 30,000 Multi-Threaded Points in CPU-z" - Intel's no chance of catching up!

Initial benchmarks of AMD's new flagship. The results could probably be better, as these are engineering samples:

https://wccftech.com/amd-epyc-7773x-milan-x-flagship-cpu-benchmarked-in-dual-socket-configuration-scores-almost-30000-multi-threaded-points-in-cpu-z/

Milan is already killing Intel's FUTURE servers planned for this year after delays, and now this. Note the expected price is higher, at $9000, vs. the Chinese ES on sale.

That's what KeyBanc and Wells Fargo talked about regarding AMD's datacenter revenue and Ryzen getting higher market share in 2022. The PPS will follow.

👍 19
👀 u/TOMfromYahoo
📅 Jan 11 2022
A Multi-threaded CPU-implementation of SPH Viscoelastic Fluid Simulation that I wrote in C++ with GLFW github.com/consequencesun…
👍 80
📅 Dec 24 2021
Oracle Working On Multi-Threaded VFIO Page Pinning For ~10x Faster QEMU Initialization - Phoronix phoronix.com/scan.php?pag…
👍 75
👀 u/tmd_h
📅 Jan 06 2022
A Multi-threaded CPU-implementation of SPH Viscoelastic Fluid Simulation that I wrote in C++ with GLFW github.com/consequencesun…
👍 28
📅 Dec 20 2021
Alleged Intel Core i5-12600K Alder Lake CPU Benchmark Shows 50% Higher Multi-Threaded Performance Versus AMD Ryzen 5 5600X wccftech.com/intel-core-i…
👍 202
👀 u/three_dots--
📅 Oct 23 2021
Dual Chinese Zen-Based CPUs Beat Ryzen 5 5600X In Multi-Threaded Workloads tomshardware.com/news/dua…
👍 16
👀 u/Long_on_AMD
📅 Dec 18 2021
Intel Core i9-12900K Flagship Alder Lake CPU Benchmarks Leak Out Again, Fastest Single-Threaded Chip & Right On Par With AMD Ryzen 9 5950X In Multi-Threaded Tests wccftech.com/intel-core-i…
👍 163
👀 u/three_dots--
📅 Oct 09 2021
AMD EPYC 7773X Milan-X Flagship CPU Benchmarked In Dual-Socket Configuration, Scores Almost 30,000 Multi-Threaded Points in CPU-z wccftech.com/amd-epyc-777…
👍 20
👀 u/long_AMD
📅 Jan 11 2022
GitHub - gregyjames/OctaneDownloader: A high performance, multi-threaded C# file download library. github.com/gregyjames/Oct…
👍 44
👀 u/Nero8
📅 Dec 08 2021
Multi threaded bolt
👍 14k
📅 Aug 02 2021
Multi-threaded google unit test failing 1/10,000 times

It's a deadlock issue. Originally it was 1/20. Made some changes then 1/100. Now it's 1/10000. It's incredibly difficult to reproduce it. My question is, is it even worth debugging/fixing if it's so rare?

👍 10
👀 u/doctorDoakHead
📅 Nov 30 2021
[Share] App: Multi-threaded download accelerator & video grabber apps.apple.com/us/app/fge…
👍 2
👀 u/Ologon
📅 Jan 03 2022
Theory for multi-threaded MC, or any large voxel/pixel program

Preface

I realize multi-threaded Minecraft is a bit of a meme, but I want to discuss some theories around how it could be done if you had lots of money and time and such. I came across this because I was working on an old cellular automata project and realized I had no way to make it multi-threaded. If I could make it work on my project, the same thing would, in theory, work in Minecraft, right? I might try implementing the best proposal in my cellular automata project just for the heck of it, so this is not purely academic.

This is also extremely programming-heavy and doesn't really involve FeedTheBeast in any way, but this is the only place where I know I can find experienced Minecraft programmers. I know programming very well in general, but I don't know anything about Minecraft source code. And finally, I'm sure this problem has been solved somewhere before, but I haven't been able to find it.

Some specifics:

- For the sake of argument, write this in any framework/language. For example, write it for Bedrock, Minestorm, or a Rust rewrite. It doesn't even have to be for Minecraft, just any voxel-based environment of sufficient complexity, like Minetest.

- True arbitrary multiprocessing. As threads/cores/memory approach infinity, so should performance. Putting lighting or chunk loading on a different thread doesn't count.

- Two block updates in different chunks which do not affect each other should usually run on separate threads.

- Focus on server only.

- Arbitrary number of players in a world is more important than arbitrary performance in a single area. 5000 players > 5000 cows.

Dynamically shaped regions

Vaguely, each player gets their own region. All of the chunks in their region are processed on a separate thread. As they move around, their region moves with them. If two players walk close enough that their regions touch, you get a region sync and merge. This means no matter how many players are at a base, it still only runs one thread. All block updates that exit a region add those chunks to the previous region. When a redstone line runs thousands of blocks into unloaded chunks, the region reshapes to include all of them. If the redstone line activates a chunk loader, then all of those chunks get added to the region as well. Each dimension would always have separate regions, so going to the nether would not cause a region merge, it would just move the play

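The "region sync and merge" idea above resembles a disjoint-set (union-find) structure: every player starts in their own region, and regions that touch are unioned so that one thread processes them. What follows is a minimal C sketch of that bookkeeping only. The names `regions_init`, `region_find`, `region_merge`, `region_count` and the cap `MAX_PLAYERS` are hypothetical, and the sketch ignores splitting a region again when players walk apart, which plain union-find does not support.

```c
#include <assert.h>

#define MAX_PLAYERS 8  /* hypothetical cap, for illustration only */

int parent[MAX_PLAYERS];  /* parent[i] == i means player i is a region root */

/* Every player starts out owning a separate region. */
void regions_init(void) {
    for (int i = 0; i < MAX_PLAYERS; ++i)
        parent[i] = i;
}

/* Return the region id (root) for player p, shortening the path as we go. */
int region_find(int p) {
    assert(p >= 0 && p < MAX_PLAYERS);
    while (parent[p] != p) {
        parent[p] = parent[parent[p]];  /* path halving */
        p = parent[p];
    }
    return p;
}

/* Merge the regions of players a and b; returns the surviving region id. */
int region_merge(int a, int b) {
    int ra = region_find(a);
    int rb = region_find(b);
    if (ra != rb)
        parent[rb] = ra;
    return ra;
}

/* Count distinct regions, i.e. how many worker threads would have work. */
int region_count(void) {
    int n = 0;
    for (int i = 0; i < MAX_PLAYERS; ++i)
        if (parent[i] == i)
            ++n;
    return n;
}
```

After `regions_init()`, calling `region_merge(0, 1)` when players 0 and 1 come within range leaves one fewer region, and any later `region_find` on either player returns the same id. A real server would also need to track which chunks belong to each region and to split regions again, neither of which is shown here.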

👍 96
👀 u/livefrmhollywood
📅 Oct 25 2021
Multi threaded rendering appreciation post!!

Thank you ZOS!! Finally!! I need to test this more, but so far this has fixed the weird fps drops that I used to get for no apparent reason!! It would dip as low as 40 fps in random locations!! I'm running a Strix 2080 Ti, i9 9900K and 32 gigs of RAM. 4K with a ton of addons, ReShade, an unlimited draw distance mod, and mips set to -3. Game is smooth as butter on the 65 inch 4K TV @60fps.

I was baffled that my Xbox Series X version ran smoother in towns. This is no longer the case! Now all I need is the option to have all my crown purchases and CP be shared between platforms!! I don't give a fuck about losing my Xbox toons and inventory, but if they could find a way for just CP and crown purchases to be shared across platforms I would be in heaven!!

👍 216
👀 u/Brunz514
📅 Aug 23 2021
Multi threaded bolt
👍 2k
👀 u/killHACKS
📅 Aug 02 2021
Why can't I manage to get this very simple multi-threaded program to be faster?

This simple program:

#include <stdlib.h>
#include <stdio.h>
#include <pthread.h>
#include <time.h>

#define N_THREADS 4 //(4*8)
#define LIMIT 1000000000
#define CONSECUTIVE_ITERS (LIMIT/N_THREADS)

long RESULTS_THREADS[N_THREADS];
clock_t TIMES_THREADS[N_THREADS];

void* threadRoutine(void* arg) {
    clock_t start = clock();

    int threadIndex = *((int*) arg);
    free(arg);
    
    int i = threadIndex * CONSECUTIVE_ITERS;
    
    long totalSum = 0;
    for (int j = i; j < i + CONSECUTIVE_ITERS; ++j)
        totalSum += j;

    RESULTS_THREADS[threadIndex] = totalSum;
    TIMES_THREADS[threadIndex] = clock() - start;
    return NULL; // a pthread start routine must return a void*
}

int main() {
    clock_t start_parallel = clock();

    pthread_t threads[N_THREADS];
    for(int i = 0; i < N_THREADS; ++i) {
        int* index = malloc(sizeof(int));
        *index = i;
        pthread_create(&threads[i], NULL, threadRoutine, (void*) index);
    }    

    long result_parallel = 0;
    for(int i = 0; i < N_THREADS; ++i) {
        pthread_join(threads[i], NULL);
        result_parallel += RESULTS_THREADS[i];
    }

    double time_parallel = ((double) (clock() - start_parallel))/CLOCKS_PER_SEC;

    clock_t start_sequential = clock();

    long result_sequential = 0;
    for(int k = 0; k < LIMIT; ++k)
        result_sequential += k;

    double time_sequential = ((double) (clock() - start_sequential))/CLOCKS_PER_SEC;

    printf("parallel: \t\t%ld\n", result_parallel);
    printf("Sequential: \t\t%ld\n\n", result_sequential);

    printf("Time parallel:   %fs\n", time_parallel);
    for (int i = 0; i < N_THREADS; ++i) {
        double time_thread = ((double) TIMES_THREADS[i])/CLOCKS_PER_SEC;
        printf("Time thread %d:   %fs\n", i, time_thread);
    }

    printf("Time sequential: %fs\n", time_sequential);

    exit(EXIT_SUCCESS);
}

Gives this result:

Time parallel:   2.043344s
Time thread 0:   2.043192s
Time thread 1:   2.032970s
Time sequential: 2.212173s

Which is something that I just don't understand. I have an AMD Ryzen 5 3550H, so 4 cores. This should be faster, because each individual thread does half the work of the main thread, but they take practically the same time! And it actually g

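One likely factor in the numbers above, offered as a hedged reading rather than a confirmed diagnosis: `clock()` reports processor time used by the whole process, and on Linux that is accumulated across all threads, so it does not shrink when the same work is spread over more cores. Below is a minimal sketch of timing against a wall-clock source instead, using POSIX `clock_gettime` with `CLOCK_MONOTONIC`. The helper names `wall_seconds` and `sum_range` are made up for illustration and are not part of the original program.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L  /* ask for clock_gettime in strict modes */
#endif
#include <time.h>

/* Elapsed wall-clock seconds from a monotonic clock.
 * Unlike clock(), this does not add up CPU time across threads. */
double wall_seconds(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double) ts.tv_sec + (double) ts.tv_nsec / 1e9;
}

/* Sum the integers in [begin, end): the same kind of work each
 * thread performs in the program above (hypothetical helper). */
long long sum_range(long long begin, long long end) {
    long long total = 0;
    for (long long j = begin; j < end; ++j)
        total += j;
    return total;
}
```

To compare the two versions fairly, one would take `wall_seconds()` immediately before the `pthread_create` loop and again after the last `pthread_join`, and do the same around the sequential loop. If per-thread CPU time is wanted as well, POSIX also defines `CLOCK_THREAD_CPUTIME_ID` for `clock_gettime`.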

👍 6
👀 u/Andy12_
📅 Nov 19 2021
OTAKHI.Speedo: A General Purpose Multi-Threaded/SIMD Acceleration Engine youtube.com/watch?v=E2LCp…
👍 6
👀 u/mfdesigner
📅 Dec 16 2021
Multi threaded bolt
👍 2k
👀 u/AmerBekic
📅 Aug 02 2021
TypeScriptBC - A specification for a derivative of TS that produces tiny binaries, introduces a Rust-like borrow checker, and has Go-like multi-threaded concurrency (please contribute!) github.com/alshdavid/Type…
👍 50
👀 u/apatheticonion
📅 Sep 29 2021
Clio: a functional, multi-threaded programming language that compiles to JavaScript

I've been working on a functional programming language in the past few years and I'd like to share it with you, would be nice to have some feedback on it! The language is called "Clio" and you can find it here: https://github.com/clio-lang/clio or here: https://clio-lang.org

It has a minimal and noise-free syntax, a minimal type system, and also a gradual type checking system. It has a few innovations, for example, remote functions and built-in support for clustering and making distributed systems. It compiles to JavaScript, it's super fast [1], and it brings multi-threading to the browser.

Let me know what you think, any feedback is appreciated. I'm looking forward to hearing your opinions so I can improve my language!

[1] https://pouyae.medium.com/clio-extremely-fast-multi-threaded-code-on-the-browser-e78b4ad77220

👍 24
👀 u/pouyae
📅 Oct 11 2021
yandex/odyssey: Scalable PostgreSQL connection pooler - Advanced multi-threaded PostgreSQL connection pooler and request router. github.com/yandex/odyssey
👍 23
👀 u/binaryfor
📅 Nov 12 2021
Fairly low power, multi-threaded home server suggestion for Proxmox and container workloads?

I'm looking for a little advice on a home server. I'm currently running a Synology DS720+ for file storage, Plex and PiHole. I also have a Dell OptiPlex 3070 (i5-9500T - 6 cores / 6 threads) running a bunch of containers.

I want something a bit more beefy to replace the OptiPlex, specifically, I want more cores with hyperthreading and more than a single NIC. I'd like to run Proxmox for a few VMs:

  • PiHole VM on one NIC.
  • A separate 'apps' VM on another NIC for my media stack, git etc.
  • Be able to throw up a few VMs for Kubernetes nodes - purely for testing and learning, nothing serious.
  • Potentially a small minecraft server for my friends and I.
  • Potentially a code build pipeline in the future.
  • Potentially bin off my NAS and run FreeNAS or something in its place on this server.

I'm conscious of my power bill so I'd like something that can be fairly low power when idling but still have a decent number of cores with hyperthreading.

I can't find any 1L / Micro form factor PCs that offer more than a single NIC. I see there's an Intel NUC 11 with dual NICs coming soon which is an option, but it is on the pricier side and would rule out running my NAS on the same machine due to the storage constraints.

I've looked a little bigger, at the HPE MicroServer Gen10, but while the form factor, power consumption and HDD space would work for me, the CPU's a little weak. ServeTheHome has a great piece detailing upgrade options, with the Xeon E-2236 being the desired CPU. It's £270 on top of buying the machine for around £470, and that's before getting an SSD and RAM.

I could build something, and I'm open to that having built gaming PCs over the years, but not really sure what to look for with the CPU / Mobo / PSU for a home server.

TLDR - I want a smallish form factor home server with 6-8 cores hyperthreaded (12-16 threads) with multiple NICs, 2 would work, 4 would be perfect. What can you recommend?

👍 14
📅 Oct 05 2021
Set up Verbose logs for multi-threaded standalone applications. darkchestnut.com/2021/set…
👍 7
👀 u/dzecniv
📅 Nov 22 2021
yandex/odyssey: Scalable PostgreSQL connection pooler - Advanced multi-threaded PostgreSQL connection pooler and request router. github.com/yandex/odyssey
👍 35
👀 u/binaryfor
📅 Nov 12 2021
Async with futures in a multi-threaded Scala app

If I have 10 blocking operations and 5 threads, I get that I can only run 5 of those operations at once, and none of my threads are free til the operation they were given completes.

I also get that if I make those operations non-blocking, I'm no longer limited in this way: I give an operation to a thread, it kicks it off, and then it's free for more work.

What I don't get is how this is so. Is the thread handing the operation off to the OS and saying "tell me when it's done"? If so, how does the OS not get jammed up in the same way as our blocking/5-thread app? It seems like at some point there must be a thing that waits on the operation to complete.

If it helps, here is an example of contrasting behavior: the first is async and parallel, the second blocking and parallel.
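As a language-neutral illustration of the question above (plain C on POSIX, not the poster's Scala code): the kernel can keep track of many pending operations and wake a single thread when any one of them becomes ready, so no thread has to be parked per operation. This is a minimal sketch using `poll()`. The function name `wait_for_ready`, the limit of 16 descriptors, and the idea of using pipes as stand-ins for slow I/O are assumptions made for illustration.

```c
#include <poll.h>
#include <unistd.h>

/* Block ONE thread until any of the n descriptors is readable.
 * Returns the index of the first ready descriptor,
 * or -1 on timeout, error, or an unsupported n. */
int wait_for_ready(const int *fds, int n, int timeout_ms) {
    struct pollfd pfds[16];  /* hypothetical small fixed limit */
    if (n < 1 || n > 16)
        return -1;
    for (int i = 0; i < n; ++i) {
        pfds[i].fd = fds[i];
        pfds[i].events = POLLIN;  /* wake us when there is data to read */
        pfds[i].revents = 0;
    }
    if (poll(pfds, (nfds_t) n, timeout_ms) <= 0)
        return -1;
    for (int i = 0; i < n; ++i)
        if (pfds[i].revents & POLLIN)
            return i;
    return -1;
}
```

With two pipes standing in for two outstanding operations, a call such as `wait_for_ready(fds, 2, 1000)` returns as soon as either pipe has data, and the calling thread then handles whichever one completed. Non-blocking I/O libraries on the JVM are generally built on this kind of readiness notification, though the details vary by platform and library.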

👍 4
👀 u/branweb1
📅 Nov 01 2021
Shorter or longer financial data tables - more effective for multi-threaded/API processing?

Simple example - suppose the following data are stored in the following data structure:

Design 1:

A single (but long) table:
e.g. daily financial data from 1960 - 2020 for a portfolio

Design 2:

Multiple (but short) tables:
e.g. each table only has one year's daily financial data for a portfolio

To make things quicker for computation (such as multi-threads, multi-core processing or even API access), generally, would it be better to lump all data into one table (e.g. a dataframe) or split them up into smaller chunks of tables?

Imagine an API attempts to 'GET' values from Design 1; I'd think this would be longer than Design 2? But for multi-threaded, wouldn't Design 2 involve context switching i.e. more cost?

👍 14
👀 u/runnersgo
📅 Sep 14 2021
[AskJS] Why don't they make JavaScript multi-threaded?

Why is JavaScript single-threaded?

Wouldn't it be better if they made it multi-threaded and allowed for real asynchronous code instead of the current event loop stuff we have now?

Are there engineering limitations that prevent this?

👍 150
👀 u/z0cccccc
📅 Jul 01 2021
magnetfinder - a multi-threaded CLI torrent aggregator, perfect for Plex & SSH. github.com/bleusakura/mag…
👍 82
👀 u/rredler
📅 Sep 17 2021
'Multi-threaded' downloader written by me over several days out of pure spite
👍 172
👀 u/Phpminor
📅 Aug 01 2021
New pipelined multi-threaded plotter implementation (work in progress)

https://github.com/madMAx43v3r/chia-plotter

chia-plotter (pipelined multi-threaded)

This is a new implementation of a chia plotter which is designed as a processing pipeline, similar to how GPUs work, only the "cores" are normal software CPU threads.

As a result this plotter is able to fully max out any storage device's bandwidth, simply by increasing the number of "cores", i.e. threads.

Usage

chia_plot <pool_key> <farmer_key> [tmp_dir] [tmp_dir2] [num_threads] [log_num_buckets]

For <pool_key> and <farmer_key> see output of `chia keys show`.
<tmp_dir> needs about 200G space, it will handle about 25% of all writes. (Examples: './', '/mnt/tmp/')
<tmp_dir2> needs about 110G space and ideally is a RAM drive, it will handle about 75% of all writes.
If <tmp_dir> is not specified it defaults to current directory.
If <tmp_dir2> is not specified it defaults to <tmp_dir>.

Make sure to crank up <num_threads> if you have plenty of cores, the default is 4. Depending on the phase more threads will be launched, the setting is just a multiplier.

RAM usage depends on <num_threads> and <log_num_buckets>. With default <log_num_buckets> and 4 threads it's ~2GB, with 16 threads it's ~6GB.

Results

On a dual Xeon(R) E5-2650v2@2.60GHz R720 with 256GB RAM and a 3x800GB SATA SSD RAID0, using a 110G tmpfs for <tmp_dir2>:

Number of Threads: 16
Number of Sort Buckets: 2^7 (128)
Working Directory:   ./
Working Directory 2: ./ram/
[P1] Table 1 took 21.0467 sec
[P1] Table 2 took 152.6 sec, found 4295044959 matches
[P1] Lost 77279 matches due to 32-bit overflow.
[P1] Table 3 took 181.169 sec, found 4295030463 matches
[P1] Lost 62514 matches due to 32-bit overflow.
[P1] Table 4 took 223.303 sec, found 4295044715 matches
[P1] Lost 76928 matches due to 32-bit overflow.
[P1] Table 5 took 232.129 sec, found 4294967739 matches
[P1] Lost 235 matches due to 32-bit overflow.
[P1] Table 6 took 221.468 sec, found 4294932892 matches
[P1] Table 7 took 182.597 sec, found 4294838936 matches
Phase 1 took 1214.37 sec
[P2] max_table_size = 4295044959
[P2] Table 7 scan took 16.9198 sec
[P2] Table 7 rewrite took 44.796 sec, dropped 0 entries (0 %)
[P2] Table 6 scan took 47.5287 sec
[P2] Table 6 rewrite took 81.2195 sec, dropped 581301544 entries (13.5346 %)
[P2] Table 5 scan took 46.6094 sec
[P2] Table 5 rewrite took 77.9914 sec, dropped 76

👍 128
👀 u/HugoMaxwell
📅 May 29 2021
Voxel-grids-on-an-LBVH raytracing. GTX 1050Ti at 1080p. Vulkan multi-threaded rendering. v.redd.it/6gbj479opqd71
👍 148
👀 u/too_much_voltage
📅 Jul 27 2021
