A list of puns related to "Query optimization"
Hello everyone,
So, I have been tasked with automating some reports for my organization. I'm relatively new to DAX so I request help with this one. Apologies if this goes very long.
I have a sales table like this:
1000s of products with lakhs of rows
And a stock table like this:
https://preview.redd.it/1pwawr6mt2c81.png?width=712&format=png&auto=webp&s=8c52ee71151e2ea7d5f32cc476dffb903374026c
I need to prepare a top 30 sales report as the main report. And for reconciliation, the rest of the products should be numbered as 31 and named as Others. So what I did is:
The issue is, I need a subsidiary report with the top 5 supplier sales of the top 30 units along with the stock present with the respective suppliers. Below image for reference:
https://preview.redd.it/0ecubk5qt2c81.png?width=655&format=png&auto=webp&s=7a34ea43a70a83d6f8cb3d2ce79a277cc90b96e1
I took the first two columns from Index Table and the rest from the linked tables. However, I couldn't get the stock values as intended. It cites something related to relationships. I couldn't create any other relationship related to the supplier due to many duplicates present under "Others".
It's okay if this report doesn't have the 31st product. Top 30 is fine. However, I don't want to create a separate file for this altogether.
I want to know whether I could write any DAX measure to get the stock values. Also, is there any way I could optimize the way I arrived at the top 30 products?
Thank you.
I am trying to understand if there is a more performant way to achieve this outcome. I have a table (called SRC.Prescription_Fills) containing 1,954,935,946 records, all bogus data. I partitioned the table so each month and year is within its partition.
I want to get a record count for each year and month. I created the following query to generate this output.
SELECT [Month] AS [Month],
[2014], [2015], [2016], [2017], [2018], [2019], [2020]
FROM
(
SELECT
YEAR(Fill_Date_Time) AS [Year],
MONTH(Fill_Date_Time) AS [Month],
Fill_Identifier
FROM
SRC.Prescription_Fills WITH (NOLOCK)
) AS Prescription_Fills
PIVOT
(
COUNT_BIG(Fill_Identifier)
FOR [Year] IN ([2014], [2015], [2016], [2017], [2018], [2019], [2020])
) AS PVT
ORDER BY
[Month];
The query plan for this query is https://www.brentozar.com/pastetheplan/?id=B1j0YbG9K.
The output of this query looks like this:
https://preview.redd.it/4865dh6nfw481.png?width=805&format=png&auto=webp&s=acd49703a93fb321bf287bd0bde033eb72110cf6
This query takes up to 16 minutes to run. Any ideas of what I can do to improve it?
WITH release_cte AS
(SELECT release_nbr,
base_div_name,
country_code,
house_nbr,
xref_count,
effective_release_date,
ROW_NUMBER() OVER (PARTITION BY release_nbr
ORDER BY create_ts ASC) AS row_nbr
FROM status_table
WHERE create_ts >= '2021-03-27 18:43:50.307'
AND house_nbr=32612
AND country_code='US'
AND process_status_code in (16, 4096)
AND release_nbr >= 0
AND release_nbr NOT in
(
SELECT DISTINCT release_nbr
FROM status_table
WHERE create_ts >= '2021-03-27 18:43:50.307'
AND house_nbr=32612
AND country_code='US'
AND item_xref_id= -1
)
)
SELECT release_nbr,
base_div_name,
country_code,
house_nbr,
xref_count,
effective_release_date,
MAX(row_nbr) AS row_nbr
FROM release_cte
GROUP BY release_nbr,
base_div_name,
country_code,
house_nbr,
xref_count,
effective_release_date;
Hello everyone!
I'm newbie to MongoDB and I'm trying to do some query profiling and indexes creation on an Amazon dataset, for University project purpose. I have a doubt about indexing I couldn't clarify, hoping that this is not too trivial:
How can I choose a good index to speed up a very slow $lookup query? I can create indexes on both collections, on fields being queried, but is this really useful?
Hi r/SQL,
Iβm in a bit of uncharted waters currently. Iβve recently changed companies, and the amount of data I sort through has gone from localized servers for individual clients, to a full blown data warehouse with billions of rows in each and all tables. (MSP->large client)
The ad hoc report Iβve been working on is not difficult or fancy. However, Iβm having to reference and join to about 10 tables with an astounding (To me) amount of data.
My question: How do I tackle this? This simple query is taking 2-3 hours to run, and even breaking it down further into individual selects with simple conditions is taking an hour to run individually. (Ex. Select X from Y where;)
Do I need to just run these queries off the clock or on a weekend? Any solutions I could try or that youβd recommend?
Edit: asked my boss the same question and he hit me with βWelcome to my worldβ hahaha
Title says it all! I need good resources on both topics
Iβm interested to understand which job profile handles database related issues like query optimization, query performance tuning etc. in large organisations?
I have two tables,
BookingMetaData and BookingDetails (MySql)
Both are huge tables.
So if i do something like
SELECT * FROM BookingMetaData bm INNER JOIN BookingDetails bd on bm.id = bd.id WHERE bm.id > 5M ; (Assuming currently there are slightly more than 5M records)
or instead of WHERE,
If I put ORDER BY bm.id DESC limit 100;
then,
Will MySql try to join the tables first (5M records) and then filter, or will it be able to do some optimisation after merging a few records in some binary search way.
If not, how can I do such operation efficiently. ( I am not allowed to change the tables)
Any help is greatly appreciated. Thank you
I am trying to figure out which one of these queries would be much faster (has less time complexity) and why.
Question 1: I am trying to find the top 10 distances.
Query 1
SELECT
user_id,
distance
FROM
travel_table
ORDER BY
distance
LIMIT 10;
Query 2
SELECT
user_id,
distance,
rank() OVER (partition by distnace DESC)
FROM
travel_table
WHERE
rank < 10;
Question 2: Finding top users
Query 1: Subquery
SELECT
name,
total_distance
FROM
(
SELECT
user_id,
SUM(distance) AS total_distance
FROM
lyft_rides_log
GROUP BY
user_id
ORDER BY
total_distance DESC
LIMIT 10
) a
INNER JOIN
lyft_users b
ON
a.user_id = b.id;
Query 2: Inner join
SELECT
name,
SUM(distance) as total_distance
FROM
lyft_rides_log r
INNER JOIN
lyft_users u
ON r.user_id = u.id
GROUP BY
name
ORDER By
total_distance DESC
LIMIT 10
I am trying to remove duplicate rows from my select query. By duplicate I mean , if the the 3 column matches ,
I want to remove those records from selected records.
I am trying like this :
>SELECT column1 , column2 , column3,column4 , column5,column6
>FROM TableA
>WHERE NOT EXIST (SELECT 1 from FROM TableB where TableB.columnX= TableA.column1 AND TableB.columnY=
>TableA.column2 AND TableB.columnZ= TableA.column3 )
By duplicate I mean , if the the above 3 column matchs , I want to remove from selected records.
This is not giving results.
Please note that this site uses cookies to personalise content and adverts, to provide social media features, and to analyse web traffic. Click here for more information.