[P] Modifying open-sourced matrix multiplication kernel

I've spent the past few months optimizing my matrix multiplication CUDA kernel, and finally got near-cuBLAS performance on a Tesla T4. In the past few weeks I've been trying to fuse all kinds of operations into the matmul kernel, such as reductions, top-k search, and masked_fill, and the results are looking pretty good. All of the fused kernels are much faster than the separate versions while using much less memory.

Runtime of fused MinBMM vs. torch.bmm + torch.min

edit: unit of time in this plot should be seconds, not milliseconds

Runtime of fused TopkBMM vs. torch.bmm + torch.topk

Runtime of fused MBMM vs. torch.bmm + torch.masked_fill

I also wrote a blog post about the motivation, applications and some implementation details of these kernels. The source code can be found in this repo.
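For context, the unfused baseline that the MinBMM plot compares against looks roughly like this (a sketch with assumed shapes, not code from the repo):

    import torch

    # Assumed illustrative shapes: (B, M, K) x (B, K, N) batched matmul
    a = torch.randn(64, 256, 128)
    b = torch.randn(64, 128, 256)

    # Unfused baseline: materialize the full (B, M, N) product, then reduce it
    prod = torch.bmm(a, b)
    values, indices = torch.min(prod, dim=-1)

    # A fused MinBMM kernel produces `values`/`indices` directly, without ever
    # writing the (B, M, N) intermediate out to global memory.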

πŸ‘︎ 192
πŸ’¬︎
πŸ‘€︎ u/DeMorrr
πŸ“…︎ May 27 2021
🚨︎ report
Matrix multiplication conditions

I've been wondering about this for a couple of years now. Why must the number of columns of the first matrix equal the number of rows of the second? I've been looking for an explanation for this but haven't found one. Does anyone know, or can anyone explain it to me?
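One way to see it: each entry of the product is defined as a sum over paired entries,

    (AB)_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj},

so row i of A (whose length is the number of columns of A) is paired term by term with column j of B (whose length is the number of rows of B). The sum only makes sense when those two lengths are equal, which is exactly the condition you are asking about.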

πŸ‘︎ 4
πŸ’¬︎
πŸ“…︎ Jul 07 2021
🚨︎ report
[Matrix Multiplication] Why do the columns of the first matrix need to equal the rows on the second matrix?

First of all, I would like to say that I'm in no way good at math. I try to learn concepts when I need them for something else so I might be missing some prerequisite knowledge.

I'm currently learning about matrix multiplication and I just don't get why you need to transpose B if A is a matrix with shape (p, m) and B is a matrix with shape (n, m).

To me it seems that if the length of the rows (the number of columns) is the same in both matrices, the multiplication can be done row by row without having to transpose matrix B first.

I tried multiplying two equal matrices with shape (3, 4) without transposing the second and came up with the same result as doing it with the transposition.

Is it just a convention to do it the right way or am I missing something in the bigger picture?
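For what it's worth, the row-for-row computation is itself a well-defined product, just of a different pair of matrices: dotting row i of A with row j of B gives

    \sum_{k} a_{ik} b_{jk} = (AB^{T})_{ij},

so it silently computes A times B-transposed. With A and B both of shape (3, 4), that is the only product that exists, which is why it matched your transposed computation; it is not an alternative convention for AB.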

πŸ‘︎ 3
πŸ’¬︎
πŸ‘€︎ u/Destroian
πŸ“…︎ Jun 05 2021
🚨︎ report
Matrix multiplication interpretation v.redd.it/s4qem9p77wy61
πŸ‘︎ 37
πŸ’¬︎
πŸ‘€︎ u/TheCloudTamer
πŸ“…︎ May 13 2021
🚨︎ report
ELI5: Why is matrix multiplication important in linear algebra and why is it needed?

I understand basic algebra and how matrix multiplication is done but why is it needed? I can't visualize it or come up with practical examples.

πŸ‘︎ 2
πŸ’¬︎
πŸ‘€︎ u/PhysicsNinja10
πŸ“…︎ Jun 11 2021
🚨︎ report
ArrayFire Matrix Multiplication Vectorization

Hello guys, I am using the ArrayFire library for signal processing. I am just curious about how to make my code more efficient. I read the vectorization guide in the docs, but I just ended up using the gfor construct. Is it possible to improve efficiency? Is there a better way to vectorize this? (I hope there is :)

Note: I am aiming for CUDA performance.

Here is the code which I am trying to improve:

#include <arrayfire.h>
#include <stdio.h>
#include <af/util.h>

static int proc_size = 1024;
static int fft_size  = proc_size * 4;
static int staves    = 288;
static int beams     = 256;

static af::array S;
static af::array B;
static af::array R;

void fn()
{
    gfor ( af::seq i, fft_size )
       R( i , af::span ) = matmul( S( i , af::span ) , B( af::span , af::span , i ) );
}

int main(int, char **)
{
    S = af::randn( fft_size , staves , c32 );

    gfor ( af::seq i, fft_size )
        S( i , af::span ) = af::randn( 1 , staves , c32 );

    B = af::randn( staves , beams , fft_size , af::dtype::c32 );
    R = af::constant( af::cfloat { 0 , 0 } , fft_size , beams );

    try
    {
        af::setDevice( 0 );
        af::info();

        double time = af::timeit(fn);

        printf( "Took %f secs.\n" , time );
    }
    catch (const af::exception &ex)
    {
        fprintf(stderr, "%s\n", ex.what());
        throw;
    }

    return 0;
}
πŸ‘︎ 7
πŸ’¬︎
πŸ‘€︎ u/ozancansell
πŸ“…︎ May 19 2021
🚨︎ report
Why exactly is matrix multiplication the way it is?

Just started recently with it at school.

Multiplying 2 regular numbers makes sense: 2*3 is merely three copies of 2 added together.

Adding 2 regular numbers? Makes sense too: 2+3 is incrementing 2 by three.

Matrix addition is also simple. You are basically incrementing the elements of the first matrix by the corresponding elements of the second matrix.

But the rules behind matrix multiplication confuse me. Why SHOULD the number of columns of the first matrix match the number of rows of the second matrix? Why do we multiply a row of the first matrix with each column of the second matrix? If 2*3 is three copies of 2 added together, how does that work for matrices? How does one add matrix A to itself matrix B times?

The rules seem very arbitrary to me, and I am not able to understand them. I did search on the internet, but a lot of the answers mentioned concepts and terms I don't understand and haven't been taught yet, like transformations.

Can anyone intuitively explain the concept and the rules of matrix multiplication to me? Why exactly is it done the way it is? Is there a specific reason, or is it just plain definition?
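One concrete way to motivate the rule without any new terminology: matrix multiplication is substitution of one set of linear expressions into another. Suppose

    z_1 = a_{11} y_1 + a_{12} y_2, \quad z_2 = a_{21} y_1 + a_{22} y_2,
    y_1 = b_{11} x_1 + b_{12} x_2, \quad y_2 = b_{21} x_1 + b_{22} x_2.

Substituting the y's into the z's gives

    z_1 = (a_{11} b_{11} + a_{12} b_{21}) x_1 + (a_{11} b_{12} + a_{12} b_{22}) x_2,

and the coefficients in parentheses are exactly the entries of AB: each one is a row of A dotted with a column of B. The column/row condition says that A must consume exactly as many y's as B produces, otherwise the substitution is not possible at all.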

πŸ‘︎ 61
πŸ’¬︎
πŸ‘€︎ u/The__BoomBox
πŸ“…︎ Mar 19 2021
🚨︎ report
Vector API (JEP 338) Benchmark Results for Matrix Multiplication, Image Convolution, and Image Thresholding. github.com/lessthanoptima…
πŸ‘︎ 69
πŸ’¬︎
πŸ‘€︎ u/lessthanoptimal
πŸ“…︎ Mar 22 2021
🚨︎ report
[Linear Algebra] Matrix multiplication

Given two matrices A and B, where A is an (m x n) matrix and B is an (n x p) matrix with columns [b_1, b_2, ... , b_p], can the product AB be thought of as the matrix with columns

[Ab_1, Ab_2, ... , Ab_p]?
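That column view agrees with the usual definition; a quick numerical sanity check (a NumPy sketch with arbitrary sizes):

    import numpy as np

    m, n, p = 4, 3, 5
    A = np.random.rand(m, n)
    B = np.random.rand(n, p)

    # Build AB one column at a time: column j of AB is A @ (column j of B)
    columns = [A @ B[:, j] for j in range(p)]
    assert np.allclose(A @ B, np.column_stack(columns))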

πŸ‘︎ 3
πŸ’¬︎
πŸ‘€︎ u/Milk_No_Titties
πŸ“…︎ Jun 01 2021
🚨︎ report
I need to multiply the first row of a matrix by one number, and the other row by another number with matrix multiplication

I have this matrix

https://preview.redd.it/hw9vradmt6y61.png?width=106&format=png&auto=webp&s=bde91083ecca191748710a79ffe88a00776cf681

I need to end up with this matrix, i.e. multiply top row by 12 and bottom row by 20:

https://preview.redd.it/df14i8qst6y61.png?width=124&format=png&auto=webp&s=c34cc26a1d8951f22fb8d148985efc4c3e0663b4

How do I do this with matrix multiplication?
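One standard way to do it (sketched with generic entries, since the linked images hold the specific matrix): left-multiply by a diagonal matrix, which scales the rows,

    \begin{pmatrix} 12 & 0 \\ 0 & 20 \end{pmatrix}
    \begin{pmatrix} a & b \\ c & d \end{pmatrix}
    =
    \begin{pmatrix} 12a & 12b \\ 20c & 20d \end{pmatrix}.

Right-multiplying by the same diagonal matrix would scale the columns instead.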

πŸ‘︎ 2
πŸ’¬︎
πŸ‘€︎ u/hardstuck4ever
πŸ“…︎ May 10 2021
🚨︎ report
Matrix Multiplication Inches Closer to Mythic Goal quantamagazine.org/mathem…
πŸ‘︎ 25
πŸ’¬︎
πŸ‘€︎ u/Sampo
πŸ“…︎ Mar 23 2021
🚨︎ report
Auto-vectorization of Matrix-Vector Multiplication

I am struggling to vectorize an inner loop in a matrix vector multiplication. The initial column-wise multiplication works well. However, I need a sparse version where I can skip 16x1 sub-blocks (i.e. the inner loop).

I suppose I am running into a bunch of bounds checks? Due to array_chunks, it should be guaranteed that the inner loop operates on contiguous 16x1 blocks. Godbolt: https://rust.godbolt.org/z/fzn6n995d

I also tried a const generics version, which does not produce any assembly. Does anyone have an idea why?

Edit: as u/WafflesAreDangerous suggested, adding a main() instantiates it. It generates no vectorized code either, even though there shouldn't be any bounds checking since the sizes are known at compile time. https://rust.godbolt.org/z/MEovzc7sc

Edit 2: The C version also does not do the best job of vectorizing; however, it (obviously) does no bounds checking: https://c.godbolt.org/z/4cPeacGqb

πŸ‘︎ 20
πŸ’¬︎
πŸ‘€︎ u/rikorose
πŸ“…︎ Mar 30 2021
🚨︎ report
Comedy gold: Redditor proves that Craig doesn't understand high school level Matrix math by showing he incorrectly thinks Matrix multiplication is commutative. BSVer who's been lying about having credentials in math argues that Craig's math is right. reddit.com/r/bsv/comments…
πŸ‘︎ 20
πŸ’¬︎
πŸ‘€︎ u/Zectro
πŸ“…︎ Mar 31 2021
🚨︎ report
Uh oh... We've hit a new level, the Matrix level. OctoKitty just invented the Multiplication Potion, currently not for sale...
πŸ‘︎ 12
πŸ’¬︎
πŸ‘€︎ u/CequezHatesMortis
πŸ“…︎ Feb 28 2021
🚨︎ report
Suppose {v1, v2, ..., vn} is a basis for R^n. Let A be an invertible nxn matrix. Is A*{v1, v2, ..., vn} still a basis for R^n? What if A is not invertible? Does the multiplication still form a basis? Please explain the reasoning.
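A sketch of the standard argument: suppose A is invertible and c_1 (A v_1) + ... + c_n (A v_n) = 0. Then

    A(c_1 v_1 + \cdots + c_n v_n) = 0 \;\Rightarrow\; c_1 v_1 + \cdots + c_n v_n = 0 \;\Rightarrow\; c_1 = \cdots = c_n = 0,

using invertibility for the first implication and the independence of the v_i for the second. So {A v_1, ..., A v_n} is a set of n linearly independent vectors in R^n, hence a basis. If A is not invertible, some nonzero vector x = c_1 v_1 + ... + c_n v_n satisfies A x = 0 with not all c_i zero, so the vectors A v_i are linearly dependent and cannot form a basis.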
πŸ‘︎ 6
πŸ’¬︎
πŸ‘€︎ u/1500Calories
πŸ“…︎ Apr 01 2021
🚨︎ report
Pearson VUE Spreadsheet Matrix Multiplication

I'm trying to use the MMULT function on the Pearson spreadsheet, and it will only return one value in one cell instead of an array. I tried pressing Ctrl + Shift + Enter and it still returns one value. I have no problem getting it to work in Excel. Any help would be appreciated!

πŸ‘︎ 2
πŸ’¬︎
πŸ‘€︎ u/Reddituser5253
πŸ“…︎ Apr 18 2021
🚨︎ report
[2015 day 10][m4] optimizing by matrix multiplication

I saw this hint in the 2015 day 10 megathread about using Conway's Cosmological Theorem to determine the length of N iterations of the look-and-say algorithm (yes, the same Conway who invented the Game of Life that appears in many other AoC puzzles). But if that comment had any code to copy from, I could not find it; it was just the high-level overview. So I got to re-discover how it worked for myself, and indeed it made a huge difference in performance!

My original implementation did what most solutions did: iterate over every single character to look for the repetitions to produce the string for the next round. And the fact that the string gets ~30% larger each round shows: computing just part 1 took 6.6s, computing part 2 extended the time to 1m30s, and then my code took another 30s just counting the length of the 2 computed strings (2m5s total runtime). Using m4's trace functionality, I determined that my code performed 37,990,946 ifdef(), 38,032,571 ifelse(), 44,980,837 pushdef(), and 8,017,641 incr() calls, plus 2 very costly len(). And trying to expand the solution to 60 iterations would be that much more painful.

But my new version is able to compute answers without ever expanding the string in memory! As Conway noted, you can set up a 92-element recurrence relationship matrix, where each row represents an element (a substring whose expansion will never overlap with the rest of the answer), and each column represents which other elements will be present after the expansion of the element. All the inputs I've seen happen to be 10-

... keep reading on reddit ➡
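To make the recurrence idea concrete, here is a toy sketch (a made-up two-element system, not Conway's actual 92 elements):

    import numpy as np

    # Toy rules: element A expands to [A, B]; element B expands to [A].
    M = np.array([[1, 1],    # row A: one A and one B after expansion
                  [1, 0]])   # row B: one A after expansion
    lengths = np.array([2, 3])   # hypothetical character lengths of A and B
    counts = np.array([1, 0])    # start with a single A

    for _ in range(40):          # 40 look-and-say rounds (part 1's count)
        counts = counts @ M      # element counts after one more round

    print(counts @ lengths)      # total string length, never built in memory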

πŸ‘︎ 15
πŸ’¬︎
πŸ‘€︎ u/e_blake
πŸ“…︎ Mar 28 2021
🚨︎ report
[Year 11: Mathematics] Can anyone explain matrix multiplication between two tensors?

If I have two tensors, say:

[[1,2,3,4,5],[6,7,8,9,10]] = matrix

matrix * matrix = x

How would I multiply this/ how does the process work to find a value for x?

πŸ‘︎ 4
πŸ’¬︎
πŸ‘€︎ u/OscarBeresford
πŸ“…︎ Mar 31 2021
🚨︎ report
Vector API (JEP 338) is about 1.7x faster: Benchmark Results for Matrix Multiplication, Image Convolution, and Image Thresholding. github.com/lessthanoptima…
πŸ‘︎ 4
πŸ’¬︎
πŸ‘€︎ u/lessthanoptimal
πŸ“…︎ Mar 22 2021
🚨︎ report
Tool for visualizing matrix-vector multiplication with color values desmos.com/calculator/gvu…
πŸ‘︎ 4
πŸ’¬︎
πŸ‘€︎ u/hat-tippe
πŸ“…︎ Apr 03 2021
🚨︎ report
Any ideas what the problem could be? I am very bad at math/geometry. This was achieved by multiplying and adding distances and direction vectors, since transform matrix multiplication is way too hard for me. v.redd.it/uzrs9g2zvnk61
πŸ‘︎ 8
πŸ’¬︎
πŸ‘€︎ u/thr3rd
πŸ“…︎ Mar 02 2021
🚨︎ report
Intuitive way to write out Matrix Multiplication

I was never taught to write out matrix multiplication this way (maybe others have been), but I came up with it independently recently when I was getting back into linear algebra. The key is that the top of the first matrix and the left side of the second matrix need to form a *square*. Writing out matrix multiplications like this makes them way more intuitive, including immediately knowing the dimensions of your output matrix, which is really useful in practice. It is also easy to extend to multiplying many matrices in sequence, as well as left multiplication. Hopefully this helps you out as much as it did me: https://blogs.ams.org/mathgradblog/2015/10/19/matrix-multiplication-easy/

πŸ‘︎ 69
πŸ’¬︎
πŸ‘€︎ u/disgolf
πŸ“…︎ Nov 19 2020
🚨︎ report
Question about Matrix Multiplication

I often wonder about the design of certain conventions in math, why they are the way they are.

For instance, with matrix multiplication, if you have a product of two matrices AB, why did they decide to make it the dot product of B's columns with A's rows?

If you transpose B, then the output is simply the dot product of each corresponding row. Doesn't that seem simpler? Why complicate things?

πŸ‘︎ 2
πŸ’¬︎
πŸ‘€︎ u/Microwave_on_HIGH
πŸ“…︎ Feb 20 2021
🚨︎ report
Matrix Multiplication Inches Closer to Mythic Goal quantamagazine.org/mathem…
πŸ‘︎ 3
πŸ’¬︎
πŸ‘€︎ u/qznc_bot2
πŸ“…︎ Mar 23 2021
🚨︎ report
Matrix Multiplication Inches Closer to Mythic Goal quantamagazine.org/mathem…
πŸ‘︎ 3
πŸ’¬︎
πŸ‘€︎ u/I_did_dit
πŸ“…︎ Mar 23 2021
🚨︎ report
Matrix multiplication
πŸ‘︎ 121
πŸ’¬︎
πŸ‘€︎ u/peachesonsteam
πŸ“…︎ Oct 07 2020
🚨︎ report
[2020 Day 10 part 2] [Python] Number of paths by doing matrix multiplications

Well, it's probably not the most optimal way (especially if you're aiming for performance on bigger inputs), but I think it's pretty neat. The idea is as follows:

  • The relationship between different adapters can be represented as a directed graph, where the adapters are the nodes and an edge exists from adapter A to B if you can plug B after A, that is, if B - A <= 3.
  • The problem guarantees that whenever an edge A->B exists, B > A. This implies that we can never go "backwards" in the graph to a lower rated adapter.
  • The previous point guarantees that any path from 0 to the last adapter is a valid solution.

Then, asking for the number of possible adapter combinations is the same as asking for the number of distinct paths from the node 0 to the node representing the last adapter. This is where adjacency matrices come into play. An adjacency matrix M is a NxN square matrix where N is the number of nodes in the graph. The value for any element M(i, j) in the matrix is 1 if there exists an edge connecting the node i to the node j, and 0 otherwise.

Now, adjacency matrices have a very neat property: if you compute M^k, then M^k(i, j) becomes the number of distinct walks of length k between the nodes i and j. Since we cannot go "backwards", and since we have no cycles in this graph, this is the same as the number of paths, which in turn is the same as the number of valid adapter combinations that include k adapters. We can repeat this process up to the total number of adapters to obtain all possible combinations:

    import numpy as np

    # `lines` is assumed to hold the sorted adapter ratings with the 0-jolt outlet prepended
    n_lines = len(lines)

    # Adjacency matrix: edge from i to j if adapter j can be plugged in after adapter i
    f = lambda i, j: j > i and lines[j] - lines[i] <= 3
    m = np.fromfunction(np.vectorize(f), (n_lines, n_lines), dtype=int).astype(int)
    aux = np.identity(n_lines)

    # Sum the number of paths of every length from node 0 to the last node
    sol_part_2 = 0
    for _ in range(n_lines):
        aux = aux @ m
        sol_part_2 += aux[0, n_lines - 1]

tl;dr: you can solve day 10 part 2 by building an adjacency matrix, multiplying it by itself N times, and adding up the (0, N) element, where N is the number of adapters.

πŸ‘︎ 13
πŸ’¬︎
πŸ‘€︎ u/Coffee_Doggo
πŸ“…︎ Dec 11 2020
🚨︎ report
