
In this article, I'll try to explain, in terms that even readers unfamiliar with coding can follow, what can and cannot be done when parsing Telegram and how labor-intensive the whole process is. I won't be providing ready-to-use source code, but there will be small examples to illustrate the points.
As you may know, Telegram has chats and channels that can accumulate a large number of users. Having a list of these users can sometimes be quite useful, for example, for sending out newsletters or invitations.
In the context of Telegram, the term “parsing” usually refers to extracting a list of users from a channel or chat. Less commonly, it can also mean retrieving a list of messages.
Channels
Let’s start with channels. A channel in Telegram is a type of resource where users can only read messages from the channel owner. They cannot post messages themselves, except in cases where the channel has a linked comment chat. In that scenario, subscribers have the ability to comment on the owner’s messages.
You can obtain a list of subscribers from a channel without an attached comment chat only if it’s your own channel and it has fewer than 200 subscribers. If even one of these conditions is not met, technically speaking, parsing is impossible, and no promises can change that. There might be new methods in the future, either legal or exploiting loopholes, but as of now, there are no working solutions.
If a linked comment chat exists, you can scrape its users in the same way as with any other chat.
Regarding the list of messages in a channel, you can access it either programmatically through the Telegram API or manually by exporting the message list using the standard client.


Chats
Chats present a more intriguing challenge. Extracting a list of users manually using the standard client is nearly impossible unless you’re prepared to jot down all the information you need with a pen and notebook. This isn’t very practical, so it’s better to turn to Telegram’s native API, or, to make things easier, use a library like Telethon.
In Telethon, there's a request called GetParticipantsRequest (from telethon.tl.functions.channels), which takes an entity as input and returns a list of users.
Let's try passing it a chat:
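    from telethon.tl.functions.channels import GetParticipantsRequest
    from telethon.tl.types import ChannelParticipantsSearch

    async def test1(client):
        chat_id = 'https://t.me/kakoy-to-chat'
        chat_entity = await client.get_entity(chat_id)
        # An empty search string matches every participant
        participants = await client(GetParticipantsRequest(
            chat_entity, ChannelParticipantsSearch(''),
            offset=0, limit=200, hash=0))
        for user in participants.users:
            print(user)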
Here is what it prints for a single user:
User(id=306742xxx,
is_self=False,
contact=False,
mutual_contact=False,
deleted=False,
bot=False,
bot_chat_history=False,
bot_nochats=False,
verified=False,
restricted=False,
min=False,
bot_inline_geo=False,
support=False,
scam=False,
apply_min_photo=True,
fake=False,
access_hash=669983103xxxxx,
first_name='??\u200d>?',
last_name=None,
username='prosto_user_name',
phone=None,
photo=UserProfilePhoto(photo_id=13174487829112xxxx,
dc_id=2,
has_video=False,
stripped_thumb=b'\x01\x08\x08\x04\xe0\xaa\xe0\x8f\x9b\x8cQE\x14\x90\xcf'),
status=UserStatusRecently(),
bot_info_version=None,
restriction_reason=[],
bot_inline_placeholder=None,
lang_code=None)
The most commonly required fields include id, username, first_name, last_name, and phone. Additionally, there are numerous attributes such as bot, verified, scam, fake, photo, status, and others.
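As a minimal sketch of pulling just those fields out of a user object, the helper below copies the commonly needed attributes into a plain dictionary. The name user_to_row is hypothetical and not part of Telethon. The code assumes only that the object exposes the attribute names seen in the output above, and it uses a stand-in object with made-up values so that it runs without a Telegram session.

```python
from types import SimpleNamespace

# Field names taken from the User output shown above.
COMMON_FIELDS = ("id", "username", "first_name", "last_name", "phone")


def user_to_row(user):
    """Copy the commonly needed attributes of a user-like object into a dict.

    Attributes that are missing become None, so the helper also works
    on partial objects.
    """
    return {name: getattr(user, name, None) for name in COMMON_FIELDS}


# Stand-in for a Telethon User object; the values are made up for illustration.
sample = SimpleNamespace(id=1, username="prosto_user_name",
                         first_name="Anna", last_name=None, phone=None)
row = user_to_row(sample)
print(row)
```

In a real script the same helper could be applied to each element of participants.users.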
As you can see, the information varies greatly. Some Telegram parsing specialists manage to claim that they only obtained IDs, while usernames and phone numbers come at an extra cost. Clever, to say the least!
Phone numbers will only appear in this list if the user hasn't disabled the option to display their phone number to everyone in the settings.
By the way, it’s sometimes suggested to also determine a user’s gender. Telegram does not provide or have such data. I’m only aware of two ways to obtain this information:
- Analyze usernames and real names by checking them against a pre-existing database to draw conclusions where possible. For instance, if a username is something like Karina, Julia, or Alena, one might assume it belongs to a woman.
- Download all messages from chats for each user, extract the verbs, and determine how often they end with the letter "a" (in Russian, past-tense verbs take this ending in the feminine form). It is logical to assume that such endings would be much more frequent in messages from women than from men.
It is clear that both methods provide no guarantees and only allow for the determination of gender with a certain level of probability. Additionally, they require extra effort.
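The first approach can be sketched roughly as follows. This is only an illustration: FEMALE_NAMES is a tiny hypothetical list standing in for a real name database, guess_is_female is a made-up helper name, and the function returns None whenever it has no evidence, which will be the case for most users.

```python
# Tiny illustrative list; a real one would come from a much larger name database.
FEMALE_NAMES = {"karina", "julia", "alena"}


def guess_is_female(first_name, username):
    """Return True if either value matches the name list, None if there is no evidence."""
    for value in (first_name, username):
        if value and value.strip().lower() in FEMALE_NAMES:
            return True
    return None


print(guess_is_female("Karina", None))
print(guess_is_female(None, "prosto_user_name"))
```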
When closely examining the output of GetParticipantsRequest, we can see that regardless of the number of chat participants or the limit parameter, it only returns a maximum of 200 users. This is sufficient when the group has fewer than 200 members. However, if there are more, additional effort will be needed.
My experiments with the offset parameter revealed that it's used to specify an offset in the list of users. By default, this offset is set to zero, but if you implement a loop and increment the offset with each iteration, you can download 200 users at a time and parse almost indefinitely (or at least until you run out of users). For example, like this:
    offset = 0
    limit = 200  # the server returns at most 200 users per request
    while True:
        participants = await client(GetParticipantsRequest(
            channel, ChannelParticipantsSearch(''), offset, limit, hash=0))
        if not participants.users:
            break
        # ...
        # Here we do something with the users from the list participants.users
        # ...
        offset += len(participants.users)
However, it quickly becomes apparent that the GetParticipantsRequest function returns a maximum of 10,000 users. So far, we haven't figured out how to increase this limit. Some believe it might be impossible.
The filter parameter allows you to specify criteria that the returned results must meet.
Here are the options:
- ChannelParticipantsAdmins;
- ChannelParticipantsBanned;
- ChannelParticipantsBots;
- ChannelParticipantsContacts;
- ChannelParticipantsKicked;
- ChannelParticipantsMentions;
- ChannelParticipantsRecent;
- ChannelParticipantsSearch.
At this point, you can start experimenting, like trying to get a list of all admins or see who is currently online. Parsing online users is actually a great idea. By doing this regularly, you can filter out inactive members who joined the group and then forgot about it.
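The status field from the user output above carries the last-seen information, so one hedged way to pick out active members is to filter on it after downloading the list. The sketch below compares the class name of user.status against UserStatusOnline and UserStatusRecently, which are Telethon type names (the latter appears in the output above). The helper name filter_active and the stand-in classes are assumptions made so the example runs without Telethon.

```python
from types import SimpleNamespace

# Status class names that are treated as "active" in this sketch.
ACTIVE_STATUSES = {"UserStatusOnline", "UserStatusRecently"}


def filter_active(users):
    """Keep users whose status object belongs to one of the active classes."""
    return [
        user for user in users
        if type(getattr(user, "status", None)).__name__ in ACTIVE_STATUSES
    ]


# Stand-in classes that only mimic the names of the Telethon status types.
class UserStatusOnline:
    pass


class UserStatusLastMonth:
    pass


users = [
    SimpleNamespace(id=1, status=UserStatusOnline()),
    SimpleNamespace(id=2, status=UserStatusLastMonth()),
    SimpleNamespace(id=3, status=None),
]
active = filter_active(users)
print([user.id for user in active])
```

Running such a filter on a schedule and comparing the results over time is one way to spot members who never show up as active.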
The ChannelParticipantsSearch filter is what we should focus on, as it allows us to search for users by their username or part of it. Let's try to set up a loop:
    chat_id = 'https://t.me/stepnru'
    chat_entity = await client.get_entity(chat_id)
    keys = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
    for key in keys:
        offset = 0
        participants = await client(GetParticipantsRequest(
            chat_entity, ChannelParticipantsSearch(key),
            offset, limit=200, hash=0))
        print(key + ": " + str(participants.count))
Let me explain: we went through the entire alphabet, checking each letter to find users whose username contains it, and printed the number of matches for each letter.
Let’s see what we’ve got:
A: 28068
B: 11188
C: 5721
D: 15950
E: 7522
F: 5280
G: 8812
H: 4002
I: 9233
J: 3642
K: 15177
L: 8264
M: 20343
N: 10546
O: 5903
P: 9001
Q: 1009
R: 9882
S: 22445
T: 9881
U: 2376
V: 12249
W: 2581
X: 1749
Y: 4324
Z: 4283
As you can see, sometimes the results list contains fewer than 10,000 entries, in which case we can retrieve the entire list. Other times, it contains more than 10,000 entries, and then we can only access the first 10,000. However, a test conducted on a group of 190,000 users allowed us to gather data on 140,000 of them, which is quite substantial!
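To see which letters hit the cap, the counts printed above can be compared against the 10,000 limit. The sketch below does this arithmetic. The names CAP, truncated and max_rows are illustrative, and the resulting total is only an upper bound on the number of rows fetched, not on unique users, because the same user can match several letters.

```python
CAP = 10_000  # maximum number of users returned for one search key

# Match counts per letter, copied from the output above.
counts = {
    "A": 28068, "B": 11188, "C": 5721, "D": 15950, "E": 7522, "F": 5280,
    "G": 8812, "H": 4002, "I": 9233, "J": 3642, "K": 15177, "L": 8264,
    "M": 20343, "N": 10546, "O": 5903, "P": 9001, "Q": 1009, "R": 9882,
    "S": 22445, "T": 9881, "U": 2376, "V": 12249, "W": 2581, "X": 1749,
    "Y": 4324, "Z": 4283,
}

# Letters whose result list is longer than the cap and therefore cut off.
truncated = [key for key, count in counts.items() if count > CAP]

# Upper bound on rows that can be downloaded across all letters, before deduplication.
max_rows = sum(min(count, CAP) for count in counts.values())

print(truncated)
print(max_rows)
```

For the truncated letters, a longer search string (two letters instead of one, for example) would narrow each result list, at the cost of more requests.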
There are certainly other ways to experiment with filters and extract even more people from the chat. Consider it your homework assignment.
Please note: this method takes much longer, and parsing a group with several tens of thousands of users can take up to several dozen minutes.
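Putting the two loops together, the overall procedure might look like the sketch below: for every search key, page through the results with an increasing offset and keep each user once, keyed by id. To stay runnable without a Telegram session, the network call is replaced by an injected coroutine. The names collect_by_keys, fetch_page and fake_fetch are hypothetical, and the optional pause between requests is an assumption, not something Telethon requires.

```python
import asyncio
from types import SimpleNamespace


async def collect_by_keys(fetch_page, keys, page_size=200, pause=0.0):
    """Page through the results for every key and return {user_id: user}.

    fetch_page(key, offset, limit) must be a coroutine that returns a list
    of user-like objects with an `id` attribute (empty when exhausted).
    """
    found = {}
    for key in keys:
        offset = 0
        while True:
            users = await fetch_page(key, offset, page_size)
            if not users:
                break
            for user in users:
                # The same user can match several keys; keep the first copy.
                found.setdefault(user.id, user)
            offset += len(users)
            if pause:
                await asyncio.sleep(pause)
    return found


# Fake data source standing in for the real request: ids 3 and 4 match both keys.
FAKE_RESULTS = {
    "A": [SimpleNamespace(id=i) for i in range(5)],
    "B": [SimpleNamespace(id=i) for i in range(3, 8)],
}


async def fake_fetch(key, offset, limit):
    return FAKE_RESULTS.get(key, [])[offset:offset + limit]


result = asyncio.run(collect_by_keys(fake_fetch, ["A", "B", "C"], page_size=2))
print(sorted(result))
```

In a real script, fetch_page would wrap the GetParticipantsRequest call with ChannelParticipantsSearch(key) and return participants.users.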
I recommend saving results not in a text file, but in a database, such as SQLite:
    import sqlite3

    def add_users_in_base(bd_name, users):
        sqlite_connection = sqlite3.connect(bd_name)
        cursor = sqlite_connection.cursor()
        for user in users:
            sqlite_insert_query = "INSERT INTO users (id, deleted, bot, bot_chat_history ..... phone) VALUES (?,?,?,?,?,?,?,?) "
            data_tuple = (
                user.id, user.deleted, user.bot, user.bot_chat_history, .... user.phone)
            try:
                cursor.execute(sqlite_insert_query, data_tuple)
            except sqlite3.Error as er:
                # A row that fails to insert is skipped
                pass
        sqlite_connection.commit()
        cursor.close()
        sqlite_connection.close()
Provided the id column is declared unique, duplicates are filtered out right at insert time, which makes the data much easier to search, sort, or convert afterwards.
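The article does not show the table definition, so the following is only a sketch of one way to get that deduplication. It assumes a simplified three-column users table with id as the primary key, and it swaps the try/except approach for SQLite's INSERT OR IGNORE, which silently skips rows whose id already exists. The function name save_users and the sample rows are made up for illustration.

```python
import sqlite3


def save_users(conn, users):
    """Insert (id, username, phone) tuples, skipping ids that are already stored."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users ("
        "id INTEGER PRIMARY KEY, username TEXT, phone TEXT)"
    )
    conn.executemany(
        "INSERT OR IGNORE INTO users (id, username, phone) VALUES (?, ?, ?)",
        users,
    )
    conn.commit()


# In-memory database so the example leaves no file behind.
conn = sqlite3.connect(":memory:")
save_users(conn, [(1, "alpha", None), (2, "beta", None)])
save_users(conn, [(2, "beta", None), (3, "gamma", None)])  # id 2 is a duplicate

total = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(total)
```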
Conclusions
I’ve demonstrated how to extract information about 10,000 chat participants, and with the use of filters, you can handle even more. With some experimentation, you’ll be able to write scripts that collect the data you need in a format that’s convenient for you.
If you know any other interesting tips on this topic, don’t forget to share them in the comments!
