Post

Getting Started with SQL

A comprehensive guide on the basics of SQL and learning resources to become a better SQL programmer.

Introduction to SQL


Structured Query Language (SQL) is the standard language used to interact with databases. It allows users to create, manipulate, and query data stored in a RDBMS. SQL has become the backbone of many software applications, enabling efficient data handling, from small-scale personal databases to large-scale enterprise systems.

Some functions of SQL include:

  • Creation and modification of database structures like tables, indexes, and schemas
  • Used to insert, update, delete, and retrieve data from a database
  • SQL’s query capabilities are one of its strongest features, allowing users to extract specific data from large databases efficiently
  • Allows you to manage database permissions, ensuring that only authorized users can access or modify data

Overview of SQL’s Different Categories


1. Data Definition Language (DDL)

DDL focuses on defining and managing a database’s structure. This covers building, editing, and deleting tables, indexes, views, and other database structures. Unlike Data Manipulation Language (DML) commands, which deal with real data, DDL instructions are concerned with metadata—the database’s structure and design. DDL is used when you create, modify, or delete a table, as well as to define relationships between tables. These commands are executed as standalone statements and do not require committing because DDL commands make modifications that are automatically committed.

  • The CREATE command is used to create new database objects such as tables, indexes, and views. It defines the structure and the relationships between these objects.

  • The ALTER command is used to modify an existing database object. You can use it to add, delete, or modify columns in a table, or to change the properties of a database object.

  • The DROP command is used to delete an existing database object. When you drop a table, all the data within it is also deleted, so it should be used with caution.

  • The TRUNCATE command removes all rows from a table, effectively resetting it to its empty state, but it does not delete the table structure. This command is often faster than using the DELETE command because it doesn’t generate individual row delete operations.

Examples of Each Command in a Simple Database Schema

To better understand how these DDL commands work in practice, let’s consider a simple database schema for a company that tracks employees and departments.

  1. Creating a Database Schema

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    
     CREATE TABLE Departments (
     DepartmentID INT PRIMARY KEY,
     DepartmentName VARCHAR(100)
     );
    
     CREATE TABLE Employees (
     EmployeeID INT PRIMARY KEY,
     FirstName VARCHAR(50),
     LastName VARCHAR(50),
     HireDate DATE,
     DepartmentID INT,
     FOREIGN KEY (DepartmentID) REFERENCES Departments(DepartmentID)
     );
    

    In this schema, we have two tables: Departments and Employees. The Departments table contains information about different departments, and the Employees table contains employee details. The DepartmentID in the Employees table is a foreign key that references the DepartmentID in the Departments table, establishing a relationship between the two tables.

  2. Altering the Schema

    1
    2
    
     ALTER TABLE Employees
     ADD Email VARCHAR(100);
    

    After creating the Employees table, we realized that we also need to store the employees’ email addresses. The ALTER command allows us to add the Email column to the existing Employees table.

  3. Dropping a Table

    1
    
     DROP TABLE Departments;
    

    If the company decides to restructure and no longer needs the Departments table, the DROP command can be used to remove it from the database entirely.

  4. Truncating a Table

    1
    
     TRUNCATE TABLE Employees;
    

    If the company wants to retain the Employees table structure but clear out all the current employee records (perhaps to prepare for a new data import), the TRUNCATE command is ideal.

2. Data Manipulation Language (DML)

DDL focuses on managing and manipulating the data stored in a database’s tables. While Data Definition Language (DDL) is concerned with the database’s structure, Data Management Language (DML) works directly with the data itself, allowing you to insert new entries, update current data, and delete records that are no longer required. Unlike DDL commands, DML commands do not automatically commit changes to the database; instead, they are typically conducted within a transaction, allowing changes to be reversed if necessary.

  • The INSERT command is used to add new records (rows) to a database table. Each new row is inserted as a separate entry, with the data specified for each column in the table.

  • The UPDATE command is used to modify existing records in a database table. It allows you to change the values of one or more columns for rows that match a specific condition.

  • The DELETE command is used to remove one or more rows from a database table. The rows to be deleted are identified by a condition specified in the WHERE clause.

Examples of Each Command in a Simple Database Schema

To better understand how DML commands work, let’s continue using our simple database schema from the DDL section, focusing on the Employees table.

  1. Inserting Data into the Table

    1
    2
    
     INSERT INTO Employees (EmployeeID, FirstName, LastName, HireDate, DepartmentID, Email)
     VALUES (2, 'Jane', 'Smith', '2024-03-10', 1, 'jane.smith@example.com');
    

    This INSERT command adds a new record for Jane Smith into the Employees table, capturing her personal details and associating her with a department.

  2. Update Existing Data

    1
    2
    3
    
     UPDATE Employees
     SET DepartmentID = 3
     WHERE EmployeeID = 2;
    

    After Jane Smith transfers to a different department, this UPDATE command changes her DepartmentID to 3 in the Employees table, reflecting her new department.

  3. Deleting Data from the Table

    1
    2
    
     DELETE FROM Employees
     WHERE EmployeeID = 2;
    

    If Jane Smith leaves the company, this DELETE command removes her record from the Employees table, ensuring that her data is no longer stored in the system.

3. Data Query Language (DQL)

DQL is used to query and retrieve information from a database. DML modifies data, whereas DQL selects and displays data contained in database tables. The SELECT command is the foundation of DQL and knowing it is required for efficient data querying. Unlike other SQL queries, DQL commands are typically read-only, which means they do not modify the data but rather obtain it for viewing or subsequent processing.

Examples of SELECT Command in a Simple Database Schema

To demonstrate how the SELECT command works in practice, let’s continue using our simple database schema with the Employees and Departments tables.

  1. Basic Data Retrieval

    1
    
     SELECT EmployeeID, FirstName, LastName FROM Employees;
    

    This basic SELECT statement fetches the EmployeeID, FirstName, and LastName columns from the Employees table, showing a list of all employees.

  2. Filtering Data with WHERE

    1
    2
    
     SELECT FirstName, LastName FROM Employees
     WHERE HireDate > '2023-01-01';
    

    This query retrieves the first and last names of employees hired after January 1, 2023, filtering the results based on the hire date.

  3. Using Aggregate Functions

    1
    
     SELECT AVG(Salary) AS AverageSalary FROM Employees;
    

    This query calculates the average salary of all employees in the Employees table, providing a single summary value labeled as AverageSalary.

  4. Joining Tables to Retrieve Related Data

    1
    2
    3
    
     SELECT Employees.FirstName, Employees.LastName, Departments.DepartmentName
     FROM Employees
     INNER JOIN Departments ON Employees.DepartmentID = Departments.DepartmentID;
    

    This query joins the Employees and Departments tables to display each employee’s first name, last name, and the name of their department.

  5. Sorting Results with ORDER BY

    1
    2
    
    SELECT FirstName, LastName FROM Employees
    ORDER BY HireDate DESC;
    

    This query retrieves the first and last names of employees, sorting the results by HireDate in descending order, so the most recently hired employees appear first.

4. Data Control Language (DCL) and Transaction Control Language (TCL)

Data Control Language (DCL) and Transaction Control Language (TCL) are two important subsets of SQL that focus on managing database access and controlling transactions. While DCL deals with permissions and security, TCL ensures the integrity and consistency of database transactions. The main DCL commands are GRANT and REVOKE while main commands of TCL are COMMIT, ROLLBACK and SAVEPOINT.

  • The GRANT command is used to assign permissions to users or roles, allowing them to perform specific actions on database objects such as tables, views, or procedures.
  • The REVOKE command is used to remove previously granted permissions from a user or role, thereby restricting their access to certain database objects.
  • The COMMIT command is used to save all the changes made during the current transaction to the database. Once a transaction is committed, the changes become permanent and visible to other users.
  • The ROLLBACK command undoes all changes made during the current transaction, reverting the database to its state before the transaction began. It is useful for discarding unwanted changes or recovering from errors.
  • The SAVEPOINT command is used to set a point within a transaction to which you can later roll back. This allows for partial rollback within a transaction without undoing the entire transaction.

Examples of DCL & TCL Commands in Practice

  1. Granting Permissions

    1
    
     GRANT UPDATE ON Employees TO hr_manager;
    

    This command grants the hr_manager role the ability to update records in the Employees table, allowing HR managers to make changes to employee data.

  2. Revoking Permissions

    1
    
     REVOKE SELECT ON Employees FROM temp_user;
    

    If temp_user should no longer have read access to the Employees table, this command revokes their SELECT permission.

  3. Using COMMIT in a Transaction

    1
    2
    3
    
     BEGIN TRANSACTION;
     UPDATE Employees SET Salary = Salary * 1.05 WHERE DepartmentID = 1;
     COMMIT;
    

    This sequence starts a transaction, updates the salaries of all employees in department 1 by increasing them by 5%, and then commits the changes, making them permanent.

  4. Using ROLLBACK to undo changes

    1
    2
    3
    
     BEGIN TRANSACTION;
     DELETE FROM Employees WHERE EmployeeID = 10;
     ROLLBACK;
    

    After starting a transaction and deleting an employee record with EmployeeID = 10, the ROLLBACK command is issued, undoing the deletion and restoring the record.

  5. Using SAVEPOINT

    1
    2
    3
    
     BEGIN TRANSACTION;
     UPDATE Employees SET Salary = Salary * 1.10 WHERE EmployeeID = 5;
     SAVEPOINT after_first_update;
    

    In this transaction, after updating the salary for EmployeeID = 5, a savepoint is created.

Why Mastering SQL is Important


SQL serves as the primary tool for interacting with relational databases. By understanding SQL deeply, you gain the ability to write optimized queries that can handle complex data operations, improve database performance, and ensure data accuracy and integrity. As a data engineer, mastering SQL empowers you to build robust data pipelines, automate data processing tasks, and integrate different data sources. Ultimately, SQL proficiency enhances your ability to contribute to data-intensive projects, making you a more valuable asset in any data-driven organization.

Summary


Certainly, this blog does not cover all there is to know about SQL. It was created to provide you with an overview before diving into more advanced SQL commands. SQL is a broad language with several commands and procedures to discover. Mastering SQL requires constant practice, which may be accomplished by working through online lessons and solving SQL problems. In teh below section, I’ll attach some learning resources to help you on your journey. Engaging with these problems will not only solidify your understanding but also make you a more proficient SQL programmer.

Learning Resources


  1. W3SCHOOLS
  2. sqlbolt
  3. SQLZoo
  4. DataLemur
  5. HackerRank
  6. Leetcode
This post is licensed under CC BY 4.0 by the author.