
From the editors
Recently, we surveyed our readers and found that many of them would like to learn Python from scratch. As an experiment, we published the article “Python from Absolute Zero. Learn to code without boring books”, which covered the basics of Python: variables, conditions, loops and lists. The feedback was positive, so we decided to keep introducing readers to Python in our signature fun style.
This article, like the previous one, is available without a paid subscription, so feel free to share these links with your friends who dream of learning Python!
Let’s start with strings. To solve the problem that the friends faced, Cheburashka used the replace() method, which replaces one substring in a string with another.
First, he declared a variable s and put the string that Gena had sent him into it.
s = 'We drank beer all summer. So one day I open the door and there on the threshold is Cheburashka, all drunk, drunk, and a bottle is sticking out of his pocket.'
Next, Cheburashka defined a dictionary of the words that needed to be replaced.
slova = {'drank':'read', 'beer':'books', 'drunk':'well-read', 'bottle':'encyclopedia'}
And now, using a for loop, Cheburashka went through the dictionary to replace each word (key) with the corresponding value from the dictionary (slova[key]):
for key in slova:
    s = s.replace(key, slova[key])
print(s)
info
Dictionaries are much like lists, but their contents are written in pairs: a key and a value. You can look up a value by its key. You can think of list indices (0, 1, 2…) as keys too; in a dictionary, the keys can be strings instead.
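To make the comparison between lists and dictionaries concrete, here is a minimal sketch. The variable names and the service/port pairs are made up for illustration and are not from the article.

```python
# A list is indexed by position, a dictionary by key
services_list = ['ftp', 'ssh', 'http']
services_dict = {'ftp': 21, 'ssh': 22, 'http': 80}

first_item = services_list[0]     # look up by index
ssh_port = services_dict['ssh']   # look up by key

# Add a new pair and check whether a key exists
services_dict['https'] = 443
has_telnet = 'telnet' in services_dict

# .items() yields key/value pairs, which is handy in a for loop
for name, port in services_dict.items():
    print(name, port)
```

Looping over a dictionary directly, as Cheburashka did, gives you only the keys; .items() gives you both halves of each pair at once.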
The replace() method is also useful for removing words from a string completely. To do this, replace them with an empty string (two quotation marks with nothing between them):
s = '''I do not like drinking beer.
It's tasteless and unhealthy!'''
s = s.replace(' not', '')
print(s)
info
To put several lines of text into one variable, wrap them in three single quotes and make the line breaks directly in the code.
To get the number of characters in a string, use the len() function.
s = 'If you really cannot sit still, write the code any way you can!'
n = len(s)
print(n)
And, as I already said in the previous article, you can take slices from strings just as from lists by specifying the beginning and end of the substring in square brackets after the variable. Positions are counted from zero, and the character at the end position is not included.
s = 'My name is Bond, James Bond'
a = s[11:15]
print('Last name: ' + a)
If you need a slice from the very beginning of the string, you can omit the first number.
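A short sketch of the slice variants, reusing the Bond string from above. The negative index is an extra that the article does not mention: it counts positions from the end of the string.

```python
s = 'My name is Bond, James Bond'

start = s[:2]          # from the beginning up to (not including) position 2
first_name = s[17:22]  # positions 17, 18, 19, 20, 21
tail = s[-4:]          # the last four characters

print(start)
print(first_name)
print(tail)
```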
Let’s say you need to find the strings in a list that start with https. We iterate over them with for, check whether the first five characters of each one match the string https, and if so, print it:
mas = ['This is just a string',
       'https://xakep.ru',
       'Another string',
       'https://habr.ru']
for x in mas:
    if x[:5] == 'https':
        print(x)
To count the number of occurrences of a substring in a string, you can use the .count() method:
s = 'Guess what, in short, I bam him with an exploit on the port, and he, in short, crashed right away!'
n = s.count('short')
print(n)
Sometimes there are extra spaces or line breaks at the beginning or end of a string. Let’s remove them with a special method, .strip():
s = 'There is no such thing as too much beer! \n'
s = s.strip()
print(s)
info
Line breaks can be added using the \n character (used in all operating systems) or the \r\n pair (in Windows). There are other special characters. For example, \t is a tab character.
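Special characters are invisible when printed, so it can help to look at a string through repr(). The sketch below also shows the one-sided relatives of .strip(); the sample string is made up for illustration.

```python
s = '\t  admin:qwerty \r\n'

# repr() shows special characters as escape sequences instead of applying them
print(repr(s))

both = s.strip()    # whitespace removed on both sides
left = s.lstrip()   # only at the beginning
right = s.rstrip()  # only at the end

print(repr(both))
print(repr(left))
print(repr(right))
```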
To determine whether a substring exists in a string s, you can use the .find() method:
n = s.find('the string we are looking for')
If the desired substring is found, its position in the string will be stored in the variable n; if it is not found, n will be equal to -1.
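One thing to watch out for: position 0 is a perfectly valid result, so the value returned by .find() should be compared with -1 rather than used directly as a condition. A small sketch with a made-up sample string:

```python
s = 'root:x:0:0:root:/root:/bin/bash'

pos_root = s.find('root')        # found at the very beginning, so the result is 0
pos_bash = s.find('bash')        # found further along the string
pos_missing = s.find('nologin')  # not found, so the result is -1

# Correct check: compare with -1
if pos_root != -1:
    print('root found at position', pos_root)

# If the position itself is not needed, the in operator reads more naturally
has_bash = 'bash' in s
print(has_bash)
```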
Let’s try to determine whether a string contains an email address from Xakep.ru, that is, we will look for the substring @xakep.ru.
But first, we need one more string method: .split(). It splits a string into parts, taking a separator string as an argument. For example, s.split('\n') will split the text into paragraphs at each line break. If you leave the parentheses empty, the string is split on whitespace by default.
s = 'This is a normal string, and it contains the email address vasya@xakep.ru'
words = s.split()
for w in words:
    n = w.find('@xakep.ru')
    if n != -1:
        print('Found email: ' + str(w) + ' at position ' + str(n))
The .join() method, on the contrary, glues strings together. It takes a list and returns a string in which the elements of the list are connected by the string you called the method on.
s = 'virus is being introduced '
list1 = ['one, ', 'two, ', 'three...']
print(s + s.join(list1))
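In everyday code the string you call .join() on is usually a short separator. A minimal sketch with made-up lists; note that .join() accepts only strings, so numbers have to be converted first.

```python
words = ['one', 'two', 'three']
line = ', '.join(words)
print(line)

# Numbers must be turned into strings before joining
ports = [22, 80, 443]
ports_line = ' '.join(str(p) for p in ports)
print(ports_line)

# split() with the same separator undoes join()
round_trip = line.split(', ')
```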
Formatting strings
We have printed various things many times by joining strings with simple addition. This is not always convenient, especially since any numbers have to be converted into strings with the str() function. There is a neater and more convenient way to substitute variable values into strings. More precisely, two slightly different ways.
Method 1 – using the .format() method
We can insert pairs of curly brackets into the string, then call the string’s .format() method and pass it the values in the order in which they should be substituted.
name = 'Vasya Pupkin'
age = 20
address = 'Pushkin street, Kolotushkin house'
info = 'Name: {}. Age: {}. Address: {}'.format(name, age, address)
print(info)
You can also pass the values as a list, unpacking it with an asterisk:
data = ['Vasya Pupkin', 20, 'Pushkin street, Kolotushkin house']
info = 'Name: {}. Age: {}. Address: {}'.format(*data)
print(info)
Method 2 – via f-strings
Another option is to write the letter f before the string and then put the variables directly in curly brackets.
name = 'Vasya Pupkin'
age = 20
address = 'Pushkin street, Kolotushkin house'
info = f'Name: {name.upper()}. Age: {age}. Address: {address}'
print(info)
The main advantage of this method is that you can insert a value into a string multiple times. In addition, you can transform the values right inside the curly brackets: Python will first evaluate the expression and then substitute the result into the string. So, the .upper() method in the example above makes all the letters uppercase.
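Both points can be seen in one line: the same variable used twice, and an expression evaluated inside the brackets. The format specifier after the colon (.2f, two decimal places) is an extra that the article does not cover, and the price variable is made up.

```python
name = 'Vasya Pupkin'
age = 20

# The same value can appear several times, and expressions are evaluated in place
line = f'{name} is {age}. Next year {name} will be {age + 1}.'
print(line)

# A format specifier after the colon controls how the value is displayed
price = 3.14159
formatted = f'{price:.2f}'
print(formatted)
```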
Files
The methods listed are enough for most things you will want to do with strings. But where do these strings come from? Most often they are stored in files, so now I will tell you how to work with files in Python.
To work with a file, you need to open it. The open() function is used for this, and it works like this:
f = open('file name with path and extension', 'file mode', encoding='Text encoding')
There are several modes of working with files, but you are mainly interested in:
- r: open a file to read information from it;
- w: open a file to write information to it (creates a new file, or erases the contents of an existing one);
- a: open a file to append information to the end of it;
- a+: appending and reading.
To avoid problems with paths in Windows, use double backslashes \\ in them. You can also put the letter u before the opening quote of the file path to mark the string as Unicode, although in Python 3 every string is Unicode already, so the prefix is optional:
f = open(u'D:\\test.txt', 'r', encoding='UTF-8')
You can read the contents of a file using the .read() method:
f = open('test.txt', 'r', encoding='UTF-8')
s = f.read()
print(s)
Alternatively, you can read the file line by line using a for loop:
f = open('test.txt', 'r', encoding='UTF-8')
for x in f:
    print(x)
Once you have finished working with the file, you need to close it.
f.close()
info
To work with binary files, add the letter b to the mode when opening a file:
f = open('myfile.bin', 'rb')
d = f.read()
print("d = ", d)
We will talk more about binary data in one of the following articles.
Let’s now try to create a new text file in the same directory as our script and write the values of some variables into it.
s1 = 'One, two, three, four, five\n'
s2 = 'I am going to break the server...\n'
f = open('poems.txt', 'w', encoding='UTF-8')
f.write(s1)
f.write(s2)
f.close()
Please note that each string ends with the \n character, which starts a new line.
Let’s say you want to add a third line to the end of this file. This is where append mode comes in handy!
s3 = 'Oh, they will get tired of fixing it!\n'
f = open('poems.txt', 'a', encoding='UTF-8')
f.write(s3)
f.close()
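The difference between the w and a modes is easy to check for yourself. The sketch below works in a temporary folder so that it cannot damage real files; the file name modes_demo.txt is made up for illustration.

```python
import os
import tempfile

# A temporary directory keeps the experiment away from real files
folder = tempfile.mkdtemp()
path = os.path.join(folder, 'modes_demo.txt')  # hypothetical file name

f = open(path, 'w', encoding='UTF-8')  # 'w' creates the file
f.write('first\n')
f.close()

f = open(path, 'a', encoding='UTF-8')  # 'a' adds to the end
f.write('second\n')
f.close()

f = open(path, 'r', encoding='UTF-8')
after_append = f.read()
f.close()

f = open(path, 'w', encoding='UTF-8')  # 'w' again: the old contents are erased
f.write('third\n')
f.close()

f = open(path, 'r', encoding='UTF-8')
after_rewrite = f.read()
f.close()

print(after_append)
print(after_rewrite)
```

So if a program opens its log in w mode on every run, yesterday’s log is gone.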
It is also very convenient to open files using the with construct: thanks to with, the file is closed automatically and you don’t have to think about it.
s = 'If you close this file, your disk will be formatted!\nJoke\n'
with open('test.txt', 'w', encoding='UTF-8') as f:
    f.write(s)
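The with construct works for reading too. A minimal sketch, assuming a temporary folder for the test file: each line read in a for loop keeps its trailing \n, so it is stripped here before the line is stored.

```python
import os
import tempfile

# Work in a temporary folder so no real file is touched
path = os.path.join(tempfile.mkdtemp(), 'test.txt')

with open(path, 'w', encoding='UTF-8') as f:
    f.write('first line\nsecond line\n')

lines = []
with open(path, 'r', encoding='UTF-8') as f:
    for x in f:
        # Each line comes with its line break, so remove it
        lines.append(x.rstrip('\n'))

# After the with block the file is already closed
closed = f.closed
print(lines, closed)
```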
Working with the web
Let’s learn how to get information from web pages. First, you need to install several modules. We write in the command line:
pip install requests
pip install html2text
The requests module lets you make GET and POST requests to web pages. The html2text module converts the HTML code of web pages into plain text, that is, it strips out the HTML tags.
We import our new modules at the beginning of the program and try to get some page from the Internet.
import requests

# Make a GET request
s = requests.get('http://xakep.ru')
# Print the server response code
print(s.status_code)
# Print the HTML code
print(s.text)
The program will print out a lot of HTML code that makes up the magazine’s main page. But what if you just want the site text, not a jumble of tags? Here html2text will help. It will extract text, headings and images from the code and return them without HTML tags.
import requests
import html2text

# Make a GET request
s = requests.get('http://xakep.ru')
# Server response code
print(s.status_code)
# An instance of the parser is created
d = html2text.HTML2Text()
# A parameter that affects how links are parsed
d.ignore_links = True
# Text without HTML tags
c = d.handle(s.text)
print(c)
In addition to GET requests, there are so-called POST requests, which are used to send large texts or files to the server. If you see a form on a website, especially with a file upload, then most likely, when you click the “Submit” button, a POST request will be made.
The requests library also lets you make POST requests. You may find this useful for simulating user actions, for example, if you need to automate work with a website. You can even use this as a home-made Burp alternative!
Let’s see how to send a regular POST request. Let’s assume that there is a guest.php script on the site.ru website, which accepts the user name user and the message message from a form via a POST request and then posts them to the guestbook.
import requests

# Variables to be sent via POST request
user = 'coolhacker'
message = 'You have been pwned!!!'
# We make a POST request and pass a dictionary of fields
r = requests.post("http://site.ru/guest.php", data={'user': user, 'message': message})
print(r.status_code)
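When data is a plain dictionary, requests sends it as an ordinary form-encoded body. To get a rough idea of what travels over the wire without touching the network, you can encode the same kind of dictionary with the standard library. This is a sketch of the encoding only, not of what requests does internally, and the field values are made up.

```python
from urllib.parse import urlencode, parse_qs

fields = {'user': 'coolhacker', 'message': 'Hello from Python'}

# Form encoding: key=value pairs joined with &, spaces become +
body = urlencode(fields)
print(body)

# The server side does the reverse; parse_qs returns a list of values per key
decoded = parse_qs(body)
print(decoded)
```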
Now let’s send a request with the payload.php file as an attachment and the same two form fields as in the previous request. The key in the files dictionary is the name of the form field, so the file will reach the server in a field named misc.php.
import requests

user = 'kitty2007'
message = '(* ^ ω ^)'
# Open the file in binary mode
with open('payload.php', 'rb') as f:
    # POST request with file sending
    r = requests.post('http://site.ru/upload.php', files={'misc.php': f}, data={'user': user, 'message': message})
All that’s left is to learn how to download files. This is a lot like requesting pages, but it’s best to do it in streaming mode (stream=True). We will also need the shutil module, which has a convenient copyfileobj function. It copies the contents of one file-like object to another, in our case from the Internet to our disk.
import requests
import shutil
import os

# File to download
s = 'https://xakep.ru/robots.txt'
# Using the os.path.split(s) function, we extract the path to the file
# and its name from the string
dirname, filename = os.path.split(s)
# GET request in stream=True mode to download a file
r = requests.get(s, stream=True)
# If the server response is successful (200)
if r.status_code == 200:
    # Create a file and open it in binary mode for writing
    with open(filename, 'wb') as f:
        # Decode the data stream based on the content-encoding header
        r.raw.decode_content = True
        # Copying data stream from the Internet to a file using the shutil module
        shutil.copyfileobj(r.raw, f)
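The two helper functions from this example can be tried without any network at all. In the sketch below, io.BytesIO stands in for r.raw (this substitution is an assumption made purely for the demo, as is the sample data), and the file is written to a temporary folder.

```python
import io
import os
import shutil
import tempfile

url = 'https://xakep.ru/robots.txt'

# os.path.split cuts the string at the last slash
dirname, filename = os.path.split(url)
print(dirname)
print(filename)

# io.BytesIO pretends to be the data stream that would come from the server
data = b'User-agent: *\nDisallow: /secret/\n'
fake_stream = io.BytesIO(data)

target = os.path.join(tempfile.mkdtemp(), filename)
with open(target, 'wb') as f:
    # Copy from one file-like object to another, piece by piece
    shutil.copyfileobj(fake_stream, f)

size = os.path.getsize(target)
print(size)
```

Because copyfileobj copies in chunks, a large download does not have to fit into memory all at once, which is the point of combining it with stream=True.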
info
Server response codes help you understand how your request went. Code 200 means that the server successfully processed the request and gave us a response; 404 means the page was not found; 500 is an internal server error; 503 means the server is unavailable, and so on. A full list of status codes can be found on Wikipedia.
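You don’t have to memorize the codes: Python’s standard library knows their names. A small sketch that looks up the four codes mentioned above using the http module, which is not otherwise used in this article:

```python
from http import HTTPStatus

# Build a code -> description table for the codes from the text
phrases = {}
for code in (200, 404, 500, 503):
    phrases[code] = HTTPStatus(code).phrase
    print(code, phrases[code])
```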
Error handling
Before I look at a more real-world example, I need to show you one more language construct that is indispensable when working with files and the network. This is handling exceptional situations, that is, errors.
Often, when running a program, the computer encounters various problems: a file is not found, the network is unavailable, the disk runs out of space. If the programmer has not taken care of this, the Python interpreter will simply exit with an error. But there is a way to anticipate problems right in the code and continue working: the try... except construct.
It looks like this:
try:
    # There are some commands here,
    # which may lead to an error
    pass
except:
    # Our actions if an error occurs
    pass
You can catch specific types of errors by specifying the type name after the except keyword. For example, KeyboardInterrupt is raised if the user tries to terminate a program by pressing Ctrl-C. It is in our power to prevent this from happening!
Heck, we can even allow division by zero if we catch the ZeroDivisionError error. This is what it will look like:
try:
    k = 1 / 0
except ZeroDivisionError:
    k = 'over 9000'
print(k)
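The same approach works for files. Here is a minimal sketch of a helper that returns an empty string instead of crashing when the file does not exist. The function name read_text and the missing file path are made up for illustration; the as e part gives access to the error object itself.

```python
import os
import tempfile

def read_text(path):
    """Return the file contents, or an empty string if the file is missing."""
    try:
        with open(path, 'r', encoding='UTF-8') as f:
            return f.read()
    except FileNotFoundError as e:
        # e.filename holds the path that could not be opened
        print('No such file:', e.filename)
        return ''

# A path inside a fresh temporary folder, so the file certainly does not exist
missing_path = os.path.join(tempfile.mkdtemp(), 'missing.txt')
result = read_text(missing_path)
print(repr(result))
```

Catching one specific error type, rather than everything at once, means that unexpected problems still show up as errors instead of being silently swallowed.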
Writing a port scanner
Now we’ll write our own port scanner! It will be simple, but quite functional. The socket module, which implements work with sockets, will help us with this.
info
A socket is an interface for exchanging data between processes. There are client and server sockets. The server socket listens on a specific port waiting for clients to connect, and the client socket connects to the server. Once a connection has been established, data exchange begins.
This is what the code will look like.
import socket

# List of ports to scan
ports = [20, 21, 22, 23, 25, 42, 43, 53, 67, 69, 80, 110, 115, 123, 137, 138, 139, 143, 161, 179, 443, 445, 514, 515, 993, 995, 1080, 1194, 1433, 1702, 1723, 3128, 3268, 3306, 3389, 5432, 5060, 5900, 5938, 8080, 10000, 20000]
host = input('Enter the site name without http/https or IP address: ')
print("Please wait, port scanning in progress!")

# In a loop, we iterate through the ports from the list
for port in ports:
    # Create a socket
    s = socket.socket()
    # Set a timeout of one second
    s.settimeout(1)
    # Catch errors
    try:
        # Try to connect, pass host and port as a tuple
        s.connect((host, port))
    # If the connection caused an error
    except socket.error:
        # then we do nothing
        pass
    else:
        print(f"{host}: {port} active")
    # Close the connection
    s.close()
print("Scanning complete!")
As you can see, nothing complicated!
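If you want to reuse the check, for example in the homework below, it can be wrapped in a function. This is one possible sketch rather than the article’s code: the name is_port_open is made up, and it uses socket.create_connection, which creates the socket and connects in one call. To try it safely, the example opens a listening socket on your own machine instead of scanning someone else’s server.

```python
import socket

def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        # The with block closes the socket as soon as we are done
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused connections, timeouts and name lookup failures all end up here
        return False

# Self-check on the local machine: listen on a free port
server = socket.socket()
server.bind(('127.0.0.1', 0))  # port 0 lets the OS pick any free port
server.listen(1)
test_port = server.getsockname()[1]

open_result = is_port_open('127.0.0.1', test_port)
server.close()
closed_result = is_port_open('127.0.0.1', test_port)

print(open_result, closed_result)
```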
Homework
- Make the port scanner get the list of IPs from one file and write the scan results to another.
- In the previous article, you learned how to work with the clipboard. Write a program that runs continuously and periodically reads the clipboard contents. If they have changed, it appends them to the end of the monitoring.txt file. Try to log only those intercepted strings that contain Latin letters and numbers; this way you are more likely to catch passwords.
- Write a program that reads a file of this type:
Ivan Ivanov|ivanov@mail.ru|Password123
Dima Lapushok|superman1993@xakep.ru|1993superman
Vasya Pupkin|pupok@yandex.ru|qwerty12345
Frodo Baggins|Frodo@mail.ru|MoRdOr100500
Kevin Mitnick|kevin@xakep.ru|dontcrackitplease
User Userson|uswer@yandex.ru|aaaa321
The program should sort the lines by the domain in the email address, create a file for each domain, and put the list of email addresses in each file.
- Write a program that goes through the sites on a list, downloads their robots.txt and sitemap.xml files and saves them to disk. If a file is not found, a message about this is displayed.
That’s all for today. In the next article, you’ll learn how to work with the OS file system, understand functions, discover the power of regular expressions, and write a simple SQL vulnerability scanner. Don’t miss it!
