Python for test automation: data types
Data types in programming refer to the classification or categorization of data that determines the type of values a variable can hold and the operations that can be performed on those values. In most programming languages, data types include fundamental types such as integers, floating-point numbers, characters, and booleans, as well as more complex types such as arrays, strings, structures, and objects. Each data type has specific characteristics, storage requirements, and behavior, which dictate how the data is represented in memory and how it can be manipulated by the program.
In Python, variables do not require an explicit declaration of their data type. Instead, Python determines the type at runtime from the value assigned to the variable. Python supports a variety of built-in data types, including integers, floating-point numbers, strings, booleans, lists, tuples, dictionaries, sets, and more. Because Python is dynamically typed, a variable can also change its type during execution if it is reassigned a value of a different type. Python also provides type hinting, which allows developers to specify the expected data types of variables, function parameters and return values. Type hints are optional and are not enforced at runtime. Type hinting can massively improve the quality of our code, so we will be rigorously applying it from the start.
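As a small illustration of type hints and dynamic typing, here is a minimal sketch. The function name greet and the variable names are made up for this example.

```python
def greet(name: str, times: int = 1) -> str:
    """Return a greeting for `name`, repeated `times` times."""
    return " ".join([f"hello {name}"] * times)


timeout: float = 2.5  # variables can be annotated as well

# Python does not stop us from rebinding a name to a value of another type
value = 1
type_before = type(value).__name__
value = "one"
type_after = type(value).__name__

print(greet("Tomi", 2))          # hello Tomi hello Tomi
print(type_before, type_after)   # int str
```

Type hints become most useful together with an editor or a static type checker, which can flag a mismatch before the code is ever run.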
strings (str)
A string is a sequence of characters, typically used to represent text data. Characters in a string can include letters, digits, symbols, and whitespace. Strings are enclosed within quotation marks, either single (' ') or double (" "), depending on the programming language syntax. In Python either single or double quotes may be used, but it is a good idea to stick to one way in a single program. They can be manipulated and processed in various ways, such as concatenating (joining) multiple strings together, extracting substrings, searching for specific patterns, and performing text manipulation operations like replacing or splitting. Strings are a fundamental data type in most programming languages and are extensively used for tasks involving textual data processing, user input handling and file manipulation.
x = "hello"
print(x)
>>>hello
For test automation, strings can come up for example in verifying data, finding data based on text, finding elements in a web page based on text, etc.
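The string operations mentioned above (concatenating, extracting substrings, searching, replacing and splitting) could look roughly like this. The example text is made up.

```python
text = "Login failed: invalid password"  # made-up message for illustration

greeting = "hello" + " " + "world"       # concatenation
prefix = text[:5]                        # substring by slicing
has_error = "failed" in text             # searching with the in operator
position = text.find("invalid")          # index of the first match, -1 if missing
replaced = text.replace("password", "username")
parts = text.split(": ")                 # split into a list of strings

print(greeting)   # hello world
print(prefix)     # Login
print(has_error)  # True
print(position)   # 14
print(replaced)   # Login failed: invalid username
print(parts)      # ['Login failed', 'invalid password']
```

Strings are immutable, so replace and the other methods return a new string and leave the original untouched.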
Formatting
String formatting refers to the process of constructing strings by embedding values or variables within a string template. This allows developers to create dynamic strings that incorporate variable data or values, making them more flexible and versatile.
format method
The str.format() method provides more flexibility by allowing positional or keyword arguments to be substituted into placeholders within a string.
For example, we can fill the placeholders in order:
s = "hello {}, nice to meet {}"
print(s.format("Tomi", "you"))
>>>hello Tomi, nice to meet you
We can also use indices to direct where in the string our formatting values go:
s = "hello {0}, place {2} in specific positions using {1}"
print(s.format("Tomi", "indices", "strings"))
>>>hello Tomi, place strings in specific positions using indices
Keyword arguments are also available for formatting:
s = "hello {name}"
print(s.format(name="Tomi"))
>>>hello Tomi
% formatting
With %-formatting, placeholders within a string template are replaced with values using the % operator.
s = "hello %s, you are number %d"
print(s % ("Tomi", 1))
>>>hello Tomi, you are number 1
%s is for strings, %d is for integers and %f is for floats. There are several others, described in the Python documentation on printf-style string formatting.
f-string formatting
f-strings offer a concise and readable syntax for string interpolation, allowing expressions and variables to be directly embedded within curly braces inside a string. f-strings are the modern way of formatting strings in Python.
name = "Tomi"
print(f"hello {name}")
>>>hello Tomi
Instead of calling the format method of a string, we simply place the character f before the starting quote of a string, and place a variable inside curly braces in the string. The value of the variable is placed in the string, similar to using format with keyword arguments.
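Since f-strings accept expressions and not only variable names, method calls and arithmetic can go directly inside the curly braces. A minimal sketch with made-up values:

```python
name = "Tomi"
items = ["a", "b", "c"]  # made-up data for illustration

# Each expression inside {} is evaluated and its result is placed in the string
message = f"hello {name.upper()}, you have {len(items)} items and {2 + 3} points"
print(message)  # hello TOMI, you have 3 items and 5 points
```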
integers (int)
An integer is a whole number without any fractional or decimal component. Integers can be either positive, negative, or zero. They are typically used to represent discrete quantities, such as counts, indices, or identifiers, in computational tasks. Unlike floating-point numbers, integers do not include a decimal point or fractional component, making them suitable for operations that require precise whole-number arithmetic. Integers are a fundamental data type in most programming languages and are used extensively in mathematical calculations, logical operations, and various other computational tasks.
x = 1
print(x)
>>>1
x = -2356
print(x)
>>>-2356
x = 10_000
print(x)
>>>10000
Above are a few examples of how to assign integers to variables. Large integers can be made easier to read by placing an underscore _ between digits. The underscores are ignored when determining the actual value of the integer; they serve only as a visual aid to programmers.
For test automation, integers can come up for example in verifying data, setting timeouts, retrying a function a specific number of times, indexing into a list, etc.
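To make the retry and index use cases concrete, here is a minimal sketch. The names max_retries and results, and the outcomes in the list, are made up for this example.

```python
max_retries = 3
results = ["fail", "fail", "pass"]  # made-up outcomes of three attempts

attempts_used = 0
for attempt in range(max_retries):      # attempt takes the values 0, 1, 2
    attempts_used += 1
    if results[attempt] == "pass":      # the integer is used as a list index
        break

print(attempts_used)          # 3

# Integer arithmetic: floor division, remainder, and true division
print(7 // 2, 7 % 2, 7 / 2)   # 3 1 3.5
```

Note that dividing two integers with / always produces a float, while // keeps the result an integer.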
floating point numbers (float)
A float (short for floating-point) is a data type used to represent numbers with fractional components. Floats can store both whole numbers and decimal values, making them suitable for a wide range of mathematical calculations involving real numbers. They are characterized by their ability to "float" the decimal point, allowing them to represent values with varying levels of precision. Floats are commonly used when working with quantities that require precision beyond that of integers, such as measurements, scientific calculations, and financial data. However, it's important to note that due to the way floating-point numbers are represented in computers, they may not always be capable of exact representation, leading to potential rounding errors in calculations.
x = 1.0
print(x)
>>>1.0
x = -2356.552
print(x)
>>>-2356.552
x = 10_000.001_055
print(x)
>>>10000.001055
Above are a few examples of how to assign floats to variables. Similar to integers, floats can be made easier to read by placing an underscore _ between digits without affecting the value.
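The rounding errors mentioned earlier matter in tests, because comparing floats with == can fail unexpectedly. One common way to handle this, sketched here with the standard library function math.isclose, is to compare with a tolerance:

```python
import math

total = 0.1 + 0.2

print(total)                     # 0.30000000000000004
print(total == 0.3)              # False
print(math.isclose(total, 0.3))  # True, compares within a small tolerance
print(round(total, 2) == 0.3)    # True, rounding before comparing also works here
```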
float formatting in strings
In each type of string formatting, we can define the number of decimals we wish to show for a float by adding .xf (for %-formatting) or :.xf (for the format method and f-strings) to the placeholder, where x is the number of decimal places to show. For example:
number = 10.123123
print("%.3f" % number)
>>>10.123
print("{:.2f}".format(number))
>>>10.12
print(f"{number:.1f}")
>>>10.1
boolean (bool)
A bool is a data type used to represent logical values. It can only have one of two possible values: true or false. Booleans are essential for controlling the flow of a program through conditional statements, such as if-else statements and loops. They are also used to represent the result of logical operations, comparisons, and Boolean algebra. Booleans are named after the mathematician George Boole, whose work laid the foundation for modern digital computer logic. They play a crucial role in programming by allowing developers to make decisions based on conditions and evaluate the truthfulness of expressions.
Booleans are extensively used in testing to control test flow based on certain conditions and to verify expected outcomes. In Python, verifying expected outcomes mostly happens with the assert statement, which takes an expression that is evaluated as a boolean and, optionally, a custom error message to show when the expression is false.
In Python, the two values for booleans are True and False. Notice that the first character of each word is capitalized.
b = True
print(b)
>>>True
b = False
print(b)
>>>False
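A minimal sketch of the assert statement, with and without a failing condition. The title values are made up for illustration.

```python
expected_title = "Dashboard"
actual_title = "Dashboard"  # made-up value, e.g. read from a page under test

# A comparison produces a bool
titles_match = actual_title == expected_title
print(titles_match)  # True

# Passes silently, because the expression is True
assert titles_match, f"expected {expected_title!r}, got {actual_title!r}"

# A False expression raises AssertionError carrying the custom message
try:
    assert 2 + 2 == 5, "arithmetic is broken"
except AssertionError as error:
    message = str(error)

print(message)  # arithmetic is broken
```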
list
A list is a data structure that allows storing multiple elements in a single variable. Lists provide a way to organize and manipulate collections of data efficiently. Each element in a list is identified by an index, which represents its position within the collection. Indices start from 0 in most programming languages, including Python. Some languages require all elements of a list to be of the same type. In Python, a list can hold elements of any data type, including other lists, and the types can be mixed within one list. Lists are widely used for tasks such as storing and accessing sequential data, implementing data structures like stacks and queues, and performing mathematical operations on collections of values. Lists are fundamental components of programming and are supported in almost all programming languages, each with its own set of features and functionalities.
For test automation, lists can be used for storing test data or test results, for example when iterating through elements in a software application. They can also be used to store errors (or exceptions) encountered during test execution.
l = [0, 1, 2]
print(l)
>>>[0, 1, 2]
Methods
There are a few important methods we can use to manipulate the contents of a list or find something in a list.
append: adds an element to the end of the list
extend: extends the list by appending elements from an iterable, for example another list or tuple
insert: inserts an element at a specified index
pop: removes and returns the element at a specified index, or the last element if no index is specified
index: returns the index of the first occurrence of a specified value
count: returns the number of occurrences of a specified value
sort: sorts the elements of the list in ascending order by default, or in a specified order using optional parameters
reverse: reverses the order of the elements in the list
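The list methods above can be tried out as follows. The comments show the state of the list after each call; the values are arbitrary.

```python
numbers = [3, 1, 2]

numbers.append(4)           # [3, 1, 2, 4]
numbers.extend((5, 1))      # [3, 1, 2, 4, 5, 1]
numbers.insert(0, 9)        # [9, 3, 1, 2, 4, 5, 1]

last = numbers.pop()        # returns 1, list is [9, 3, 1, 2, 4, 5]
first = numbers.pop(0)      # returns 9, list is [3, 1, 2, 4, 5]

where = numbers.index(2)    # 2, the position of the value 2
ones = numbers.count(1)     # 1, the value 1 occurs once

numbers.sort()              # [1, 2, 3, 4, 5]
ascending = list(numbers)   # keep a copy for comparison
numbers.sort(reverse=True)  # [5, 4, 3, 2, 1]
numbers.reverse()           # [1, 2, 3, 4, 5]

print(numbers)  # [1, 2, 3, 4, 5]
```

All of these methods except index and count modify the list in place. In particular, sort and reverse return None, so writing numbers = numbers.sort() would lose the list.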
tuple
A tuple is a data structure that is used to store a fixed sequence of elements. Unlike lists, tuples are immutable, meaning their contents cannot be modified once they are created. Tuples are typically ordered collections, meaning the order of elements within a tuple is significant and preserved. Elements within a tuple can be of different data types, and they are accessed using zero-based indexing. Tuples are commonly used to group related data together, especially when the number of elements and their types are known in advance. They are useful for returning multiple values from a function, representing coordinates or key-value pairs, and organizing data in a structured format.
For test automation, tuples are mostly used for returning multiple values from a function, or to assert multiple values at once (since they are immutable, we can be sure that what we are comparing has not been accidentally modified).
Methods
Since tuples are immutable, there are only a couple of methods available:
count: returns the number of occurrences of a specified value
index: returns the index of the first occurrence of a specified value
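A minimal sketch of tuples in use: returning multiple values from a function, asserting several values at once, and the two methods above. The function get_status and its return values are made up, and the tuple[int, str] hint assumes Python 3.9 or newer.

```python
def get_status() -> tuple[int, str]:
    """Hypothetical helper that returns a status code and a reason."""
    return 200, "OK"  # the comma creates the tuple, parentheses are optional


code, reason = get_status()          # unpack the tuple into two variables
assert (code, reason) == (200, "OK") # compare multiple values at once

point = (1, 2, 1)
print(point[1])        # 2, zero-based indexing
print(point.count(1))  # 2
print(point.index(2))  # 1

# Trying to modify a tuple raises TypeError
try:
    point[0] = 5
    is_immutable = False
except TypeError:
    is_immutable = True

print(is_immutable)  # True
```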
set
A set is a data structure that represents a collection of unique elements with no specific order. Unlike lists or tuples, sets do not allow duplicate elements; each element in a set is unique. Sets are typically used for tasks that involve membership testing, removing duplicates, and performing set operations such as union, intersection, and difference. Sets are highly efficient for checking whether a specific element is present in the collection due to their internal implementation using hashing. They are commonly used in scenarios where uniqueness of elements and efficient membership testing are important, such as counting unique items, removing duplicates from data, or performing mathematical operations on collections of elements.
For test automation, sets can be used for removing duplicates from data read from multiple sources, or membership testing, i.e. checking the presence (or absence) of an expected element without having to go through an entire list of elements.
s = set()
s.add(1)
s.add(2)
s.add(1)
s.add(2)
print(s)
>>>{1, 2}
l = [1, 2, 2, 1]
s = set(l)
print(s)
>>>{1, 2}
Methods
There are a few important methods that enable us to modify the set or compare it to another set:
add: adds a single element to the set
update: adds elements from another iterable (such as another set or list) to the set
remove: removes a specified element from the set. If the element is not present, it raises a KeyError
discard: similar to remove(), but if the element is not present, it does nothing instead of raising an error
clear: removes all elements from the set, leaving it empty
union: returns a new set containing all unique elements present in either set (the | operator does the same)
intersection: returns a new set containing only elements that are present in both sets (the & operator does the same)
difference: returns a new set containing elements that are present in the first set but not in the second set (the - operator does the same)
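As a sketch of how the set operations above might be used in a test, the following compares an expected set of roles against an actual one. The role names are made up. Sets have no guaranteed order, so the results are sorted before printing.

```python
expected = {"admin", "editor", "viewer"}
actual = set(["viewer", "admin", "guest", "admin"])  # the duplicate is dropped

missing = expected - actual      # difference: expected but not found
unexpected = actual - expected   # difference the other way around
common = expected & actual       # intersection
everything = expected | actual   # union

print(sorted(missing))     # ['editor']
print(sorted(unexpected))  # ['guest']
print(sorted(common))      # ['admin', 'viewer']
print(sorted(everything))  # ['admin', 'editor', 'guest', 'viewer']
print("admin" in actual)   # True, membership test

actual.discard("nobody")   # no error, although the element is absent
try:
    actual.remove("nobody")
    remove_raised = False
except KeyError:
    remove_raised = True

print(remove_raised)  # True
```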
dict
A map, also known as a dictionary, associative array, or hash table, is a data structure that stores key-value pairs. Each key in a map is unique and is associated with a specific value. Maps allow for efficient retrieval, insertion, and deletion of elements based on their keys. They are commonly used for tasks such as storing and retrieving data by a unique identifier, implementing lookup tables, and mapping relationships between entities. Maps are highly versatile and widely used in various programming scenarios, providing a flexible way to organize and manipulate data based on key-value associations.
For test automation, knowing how to use dictionaries is important due to the format of data that is highly utilized in the web (JSON - JavaScript Object Notation). This kind of data is essentially a mapping of key-value pairs where the key is always a string, and data can be any of string, number (integer or float), object (JSON object), array (list), boolean or null (None). JSON at its core is just text, but it can be parsed into a dict data type in Python.
d = {
    "a": 1,
    "b": [0, 1, 2],
}
print(d)
>>>{'a': 1, 'b': [0, 1, 2]}
print(d["a"])
>>>1
d = dict(a=1, b=[0, 1, 2])
print(d)
>>>{'a': 1, 'b': [0, 1, 2]}
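To illustrate how JSON text becomes a dict, here is a minimal sketch using the json module from the standard library. The JSON content is made up.

```python
import json

# JSON is just text; note the lowercase true and null
raw = '{"name": "Tomi", "active": true, "score": 9.5, "tags": ["a", "b"], "manager": null}'

data = json.loads(raw)  # parse the text into Python objects

print(type(data).__name__)      # dict
print(data["name"])             # Tomi
print(data["active"])           # True
print(data["manager"])          # None
print(data["tags"][0])          # a

# json.dumps turns the dict back into JSON text
round_trip = json.loads(json.dumps(data))
print(round_trip == data)       # True
```

The JSON values true, false and null are converted to True, False and None, JSON arrays become lists, and JSON objects become dicts.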
Methods
get: returns the value associated with a specified key. If the key is not found, it returns a default value, which is None by default
keys: returns a view object containing the keys of the dictionary
values: returns a view object containing the values of the dictionary
items: returns a view object containing the key-value pairs of the dictionary as tuples
update: updates the dictionary with the key-value pairs from another dictionary or iterable
pop: removes and returns the value associated with a specified key. If the key is not found and no default value is given, it raises a KeyError
popitem: removes and returns the most recently inserted key-value pair (since Python 3.7; in earlier versions the pair was arbitrary)
clear: removes all key-value pairs from the dictionary, leaving it empty
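The dict methods above can be tried out as follows. The configuration keys and values are made up, and the popitem result assumes Python 3.7 or newer.

```python
config = {"browser": "firefox", "timeout": 10}  # made-up test configuration

timeout = config.get("timeout")        # 10
retries = config.get("retries")        # None, the key is missing
retries_or_3 = config.get("retries", 3)  # 3, our own default

keys = list(config.keys())             # ['browser', 'timeout']
values = list(config.values())         # ['firefox', 10]
pairs = list(config.items())           # [('browser', 'firefox'), ('timeout', 10)]

config.update({"timeout": 30, "headless": True})  # overwrite one key, add another
removed = config.pop("browser")        # 'firefox'
last_pair = config.popitem()           # ('headless', True), the last inserted pair
remaining = dict(config)               # {'timeout': 30}

config.clear()
print(config)  # {}
```

Using get instead of square brackets is handy in tests when a key may legitimately be absent, since d["missing"] would raise a KeyError.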
Summary
From a test automation perspective, these are the 8 data types that you should know by heart.
str (string): Strings are fundamental for representing textual data. Defined in code by enclosing text within double quotes " or single quotes '.
int (integer): Integers represent whole numbers, which are commonly used in arithmetic operations, counting, indexing, and looping constructs.
float (floating point number): Floating-point numbers represent decimal values and are used for more precise numerical calculations.
bool (boolean): Booleans represent binary logic values, True or False. Understanding boolean logic is essential for control flow, decision-making, and conditional statements in programming.
list: Lists are versatile data structures used to store ordered collections of elements. Defined in code by enclosing a comma-separated sequence of items within brackets [].
tuple: Tuples are similar to lists but are immutable, meaning their elements cannot be modified after creation. Defined in code by enclosing a comma-separated sequence of items within parentheses ().
set: Sets are unordered collections of unique elements. Defined in code by enclosing a comma-separated sequence of items within curly braces {}. An empty pair of curly braces creates a dict, so an empty set is written set().
dict (dictionary): Dictionaries are key-value pairs used to store and retrieve data efficiently. Defined in code by enclosing colon-separated key-value pairs, e.g. "a": 1, within curly braces {}.