Tuesday 25 July 2023

SQL & PySpark Comparison

Concept

SQL

PySpark

Select

SELECT column(s) FROM table

SELECT * FROM table;

df.select("column(s)")

df.select("*")

DISTINCT

SELECT DISTINCT column(s) FROM table

df.select("column(s)").distinct()

WHERE

SELECT column(s) FROM table WHERE condition

df.filter(condition).select("column(s)")

ORDER BY

SELECT column(s) FROM table ORDER BY column(s)

df.sort("column(s)").select("column(s)")

LIMIT

SELECT column(s) FROM table LIMIT n

df.limit(n).select("column(s)")
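
Putting the clauses above together, here is a minimal, hedged sketch against a hypothetical employees table (the table and column names are assumptions for illustration):

from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("sql_vs_pyspark").getOrCreate()
df = spark.read.table("employees")  # hypothetical table

# SELECT DISTINCT department FROM employees WHERE salary > 50000 ORDER BY department LIMIT 10
(df.filter(col("salary") > 50000)
   .select("department")
   .distinct()
   .orderBy("department")
   .limit(10)
   .show())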

COUNT

SELECT COUNT(*) FROM table

df.count()

SUM

SELECT SUM(column) FROM table

from pyspark.sql.functions import sum;

df.agg(sum("column"))

AVG

SELECT AVG(column) FROM table

from pyspark.sql.functions import avg;

df.agg(avg("column"))

MAX / MIN

SELECT MAX(column) FROM table

from pyspark.sql.functions import max;

df.agg(max("column"))
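
As a hedged sketch, the aggregate functions above can be combined in a single agg() call; the salary column is an assumption:

from pyspark.sql.functions import count, sum, avg, max, min  # note: sum/max/min shadow Python built-ins in this scope

# SELECT COUNT(*), SUM(salary), AVG(salary), MAX(salary), MIN(salary) FROM employees
df.agg(count("*").alias("cnt"),
       sum("salary").alias("total_salary"),
       avg("salary").alias("avg_salary"),
       max("salary").alias("max_salary"),
       min("salary").alias("min_salary")).show()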

String Length

SELECT LEN(string) FROM table

from pyspark.sql.functions import length, col;

df.select(length(col("string")))

Convert to Uppercase

SELECT UPPER(string) FROM table

from pyspark.sql.functions import upper, col;

df.select(upper(col("string")))

Convert to Lowercase

SELECT LOWER(string) FROM table

from pyspark.sql.functions import lower, col;

df.select(lower(col("string")))

Concatenate Strings

SELECT CONCAT(string1,string2) FROM table

from pyspark.sql.functions import concat, col;

df.select(concat(col("string1"), col("string2")))

Trim String

SELECT TRIM(string) FROM table

from pyspark.sql.functions import trim, col;

df.select(trim(col("string")))

Substring

SELECT SUBSTRING(string, start, length) FROM table

from pyspark.sql.functions import substring, col;

df.select(substring(col("string"),start,length))
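
A small illustrative sketch combining the string functions above; first_name and last_name are assumed columns:

from pyspark.sql.functions import col, length, upper, lower, concat, trim, substring

df.select(
    upper(col("first_name")).alias("upper_name"),                     # UPPER
    lower(col("last_name")).alias("lower_name"),                      # LOWER
    length(trim(col("first_name"))).alias("name_length"),             # TRIM + LENGTH
    concat(col("first_name"), col("last_name")).alias("full_name"),   # CONCAT
    substring(col("first_name"), 1, 3).alias("prefix")                # SUBSTRING(first_name, 1, 3)
).show()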

CURDATE, NOW, CURTIME

SELECT CURDATE() FROM table

from pyspark.sql.functions import current_date;

df.select(current_date())

CAST, CONVERT

SELECT CAST(column AS datatype) FROM table

from pyspark.sql.functions import col;

df.select(col("column").cast("datatype"))

IF

SELECT IF(condition, value1, value2) FROM table

from pyspark.sql.functions import when;

df.select(when(condition, value1).otherwise(value2))

COALESCE

SELECT COALESCE(column1, column2, column3)

FROM table

from pyspark.sql.functions import coalesce;

df.select(coalesce("column1", "column2", "column3"))
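
The conditional and conversion helpers above compose in a single projection; a hedged sketch assuming salary, nickname and first_name columns:

from pyspark.sql.functions import col, when, coalesce, current_date, lit

df.select(
    when(col("salary") > 50000, "high").otherwise("low").alias("salary_band"),           # IF(...)
    coalesce(col("nickname"), col("first_name"), lit("unknown")).alias("display_name"),  # COALESCE
    col("salary").cast("double").alias("salary_double"),                                 # CAST
    current_date().alias("load_date")                                                    # CURDATE()
).show()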

JOIN

SELECT * FROM table1 JOIN table2 ON table1.column = table2.column

df1.join(df2, "column")

GROUP BY

GROUP BY column(s)

df.groupBy("column(s)")

PIVOT

PIVOT (agg_function(column) FOR pivot_column IN (values))

df.groupBy("pivot_column")\

.pivot("column").agg(agg_function)

Logical Operations

SELECT column FROM table

WHERE column1 = value AND column2 > value

df.filter((col("column1") == value) & (col("column2") > value)).select("column")

IS NULL, IS NOT NULL

SELECT column FROM table WHERE column IS NULL

df.filter(col("column1").isNull())\

.select("column")

IN

SELECT column FROM table

WHERE column IN (value1, value2, value3)

df.filter(col("column1")\

.isin(value1, value2, value3))\

.select("column")

LIKE

SELECT column FROM table WHERE column LIKE 'value%'

df.filter(col("column").like("value%"))

BETWEEN

SELECT column FROM table

WHERE column BETWEEN value1 AND value2

df.filter(col("column") >= valye1) & (col("column") <= value2))\

.select("column")

UNION, UNION ALL

SELECT column FROM table1

UNION SELECT column FROM table2

df1.union(df2).distinct().select("column") for UNION, or

df1.union(df2).select("column") for UNION ALL (union() keeps duplicates; unionAll() is deprecated)

RANK, DENSE_RANK, ROW_NUMBER

SELECT column, RANK() OVER (ORDER BY column) AS rank FROM table

from pyspark.sql import Window;

from pyspark.sql.functions import rank;

df.select("column", rank().over(Window.orderBy("column"))\

.alias("rank"))

CTE

WITH cte1 AS (SELECT * FROM table1)

SELECT * FROM cte1 WHERE condition

df.createOrReplaceTempView("cte1");

df_cte1 = spark.sql("SELECT * FROM cte1 WHERE condition");

df_cte1.show() or

df.filter(condition1).filter(condition2)
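
To mirror the CTE more literally, the DataFrame can be registered as a temporary view and the WITH query run through spark.sql; table and column names here are assumptions:

df.createOrReplaceTempView("employees")

result = spark.sql("""
    WITH high_paid AS (SELECT * FROM employees WHERE salary > 50000)
    SELECT dept_id, COUNT(*) AS cnt
    FROM high_paid
    GROUP BY dept_id
""")
result.show()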




DDL Operations



Datatypes

INT: for integer values

BIGINT: for large integer values

FLOAT: for floating point values

DOUBLE: for double precision floating point values

CHAR: for fixed-length character strings

VARCHAR: for variable-length character strings

DATE: for date values

TIMESTAMP: for timestamp values

In PySpark, the data types are similar but are represented differently.

IntegerType: for integer values

LongType: for long integer values

FloatType: for floating point values

DoubleType: for double precision floating point values

StringType: for character strings

TimestampType: for timestamp values

DateType: for date values

Create Table

CREATE TABLE table_name

(column_name data_type constraint);

df.write.format("parquet")\

.saveAsTable("table_name")

Create Table with Column Definitions

CREATE TABLE table_name

(column_name data_type [constraint],

column_name data_type [constraint], .....);

from pyspark.sql.types import StructType, StructField, IntegerType, StringType, DecimalType

schema = StructType([
    StructField("id", IntegerType(), True),
    StructField("name", StringType(), False),
    StructField("age", IntegerType(), True),
    StructField("salary", DecimalType(), True)])

df = spark.createDataFrame([], schema)
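
For illustration, the same schema can be filled with a couple of in-memory rows and saved as a managed table (the row values and table name are assumptions):

from decimal import Decimal
from pyspark.sql.types import StructType, StructField, IntegerType, StringType, DecimalType

schema = StructType([
    StructField("id", IntegerType(), True),
    StructField("name", StringType(), False),
    StructField("age", IntegerType(), True),
    StructField("salary", DecimalType(10, 2), True)])

rows = [(1, "Alice", 30, Decimal("55000.00")),
        (2, "Bob", 41, Decimal("48000.00"))]
df = spark.createDataFrame(rows, schema)
df.write.format("parquet").mode("overwrite").saveAsTable("employees_demo")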

Create Table with Primary Key

CREATE TABLE table_name(

column_name data_type

PRIMARY KEY, .....)


If table already exists:

ALTER TABLE table_name

ADD PRIMARY KEY

(column_name);

In PySpark or HiveQL, primary key constraints are not

enforced directly. However, you can use the

dropDuplicates() method to remove

duplicate rows based on one or more columns.


df = df.dropDuplicates(["id"])

Create Table with Auto Increment Constraints

CREATE TABLE table_name(

id INT AUTO_INCREMENT,

name VARCHAR(255),

PRIMARY KEY(id));

Auto-increment columns are not natively supported by the DataFrame API, but there are several ways to achieve similar functionality.


from pyspark.sql.functions import monotonically_increasing_id

df = df.withColumn("id", monotonically_increasing_id() + start_value)

Adding a column

ALTER TABLE table_name

ADD column_name datatype;

from pyspark.sql.functions import lit

df = df.withColumn("column_name", lit(None).cast("datatype"))

Modifying a column

ALTER TABLE table_name

MODIFY column_name datatype;

df = df.withColumn("column_name", df["column_name"].cast("datatype"))

Dropping a column

ALTER TABLE table_name

DROP COLUMN column_name;

df = df.drop("column_name")

Rename a column

ALTER TABLE table_name

RENAME COLUMN old_column_name TO new_column_name;


In MySQL,

ALTER TABLE employees

CHANGE COLUMN first_name first_name_new VARCHAR(255);

df = df.withColumnRenamed("existing_column", "new_column")
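
A short sketch chaining the DataFrame equivalents of the ALTER TABLE operations above; the column names are assumptions:

from pyspark.sql.functions import col, lit

df = (df.withColumn("bonus", lit(None).cast("double"))       # ADD COLUMN bonus DOUBLE
        .withColumn("age", col("age").cast("long"))          # MODIFY age BIGINT
        .drop("temp_col")                                     # DROP COLUMN temp_col
        .withColumnRenamed("first_name", "given_name"))       # RENAME COLUMN first_name TO given_name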

