This repository has been archived by the owner on Jul 21, 2022. It is now read-only.

Merge pull request #8 from yogeshwaran01/Authentication
Login into Instagram via session id
yogeshwaran01 authored Feb 11, 2021
2 parents 5472572 + f238161 commit 0c3565a
Showing 18 changed files with 499 additions and 232 deletions.
4 changes: 4 additions & 0 deletions .gitignore
@@ -8,3 +8,7 @@ test/__pycache__
core/__pycache__
instagramy/core/__pycache__
.instagramy_cache
instagramy/plugins/__pycache__
.old
methos.md
instagramy/plugins/__pycache__
2 changes: 1 addition & 1 deletion MANIFEST.in
@@ -1 +1 @@
include README.md
include README.md
197 changes: 145 additions & 52 deletions README.md
@@ -1,9 +1,10 @@
<!-- headings -->

<h1 align="center"> Instagramy </h1>

<p align="center">Python Package for Instagram Without Any external dependencies</p>

<ul>
</ul>
<!-- Badges -->

<p align="center">
<a href="https://pypi.org/project/instagramy/">
@@ -12,6 +13,9 @@
<a href="https://pepy.tech/project/instagramy">
<img alt="Downloads" src="https://pepy.tech/badge/instagramy"/>
</a>
<a href="https://github.com/yogeshwaran01/instagramy/stargazers"><img alt="GitHub stars" src="https://img.shields.io/github/stars/yogeshwaran01/instagramy"></a>
<a href="https://github.com/yogeshwaran01/instagramy/network">
<img alt="GitHub forks" src="https://img.shields.io/github/forks/yogeshwaran01/instagramy"></a>
<a href="https://github.com/yogeshwaran01/instagramy/blob/master/LICENSE.txt">
<img alt="GitHub license" src="https://img.shields.io/github/license/yogeshwaran01/instagramy?color=blue"/>
</a>
@@ -26,9 +30,20 @@
</hr>

<p align="center">
Scrape Instagram Users Informations, Posts Details, and Hashtags details. This Package scrapes the user's recent posts with some information like likes, comments, captions and etc. No login required and any dependencies.
Scrape Instagram user information, post details, and hashtag details. This package scrapes a user's recent posts along with information such as likes, comments, and captions. No external dependencies.
</p>

## Features

- It scrapes most of the data for an [Instagram user](#Instagram-User-details), [hashtags](#Instagram-Hashtag-details) and [posts](#Instagram-Post-details)
- You can use this package [with login](#Sample-Usage) or [without login](#Use-Without-Login)
- Download an [Instagram post](#Plugins-for-Downloading-Posts) or a [user profile picture](#Plugins-for-Downloading-Posts)
- Includes some [plugins](#Plugins) for data analysis
- No external dependencies
- Lightweight


<!-- Downloading Guides -->
## Download

### Installation
@@ -43,93 +58,171 @@ pip install instagramy

```bash

pip install --upgrade instagramy
pip install instagramy --upgrade

```
## Usage

<!-- Usage -->
## Sample Usage

### Getting the Session ID of Instagram

Logging in to Instagram via instagramy requires a session ID. No username or password is needed. You must be logged in to Instagram on the same machine to get the session ID.

1. Log in to Instagram in your default web browser
2. Open the browser's developer tools
3. Copy the `sessionid` cookie value
- Firefox: open Storage, then Cookies, and copy the `sessionid`
- Chrome: open Application, then Storage, then Cookies, and copy the `sessionid`

**Note:** Don't use a session ID taken from a browser on another machine. It must come from the current local machine.

<img src="./samples/sessionid.gif" width=100% height=100%>



### Instagram User details

Class `InstagramUser` scrapes information about an Instagram user

#### Properties

- biography
- fullname
- is_private
- is_verified
- number_of_followers
- number_of_followings
- number_of_posts
- other_info
- posts
- posts_display_urls
- profile_picture_url
- username
- website
```python
>>> from instagramy import InstagramUser

>>> session_id = "38566737751%3Ah7JpgePGAoLxJe%334"

>>> user = InstagramUser('google', sessionid=session_id)

<img src="https://raw.githubusercontent.com/yogeshwaran01/instagramy/master/samples/user.png" width=100% height=100%>
>>> user.is_verified
True

>>> user.biography
'Google unfiltered—sometimes with filters.'

>>> user.user_data # More data about user as dict
```

### Instagram Hashtag details

Class `InstagramHashTag` scrapes information about an Instagram hashtag

#### Properties
You can set your session ID as an environment variable:

```bash
$ export SESSION_ID="38566737751%3Ah7JpgePGAoLxJe%er40q"
```

```python
>>> import os

>>> from instagramy import InstagramHashTag

- number_of_posts
- posts_display_urls
- profile_pic_url
- tagname
- top_posts
>>> session_id = os.environ.get("SESSION_ID")

<img src="https://raw.githubusercontent.com/yogeshwaran01/instagramy/master/samples/hashtag.png" width=100% height=100%>
>>> tag = InstagramHashTag('google', sessionid=session_id)

>>> tag.number_of_posts
9556876

>>> tag.tag_data # More data about hashtag as dict
```
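
The environment-variable approach above can be wrapped in a small helper that fails early when the variable is missing, instead of silently passing `None` as the session ID. This is a minimal sketch, not part of instagramy: `load_session_id` is a hypothetical name, and `SESSION_ID` is the variable name used in the example above.

```python
import os


def load_session_id(env_var: str = "SESSION_ID") -> str:
    """Return the session ID stored in `env_var`, or raise if it is unset or blank.

    Hypothetical helper, not part of instagramy.
    """
    value = os.environ.get(env_var, "").strip()
    if not value:
        raise RuntimeError(
            f"Environment variable {env_var} is not set; "
            "export it before creating InstagramUser or InstagramHashTag objects."
        )
    return value
```

The returned string can then be passed as the `sessionid` argument shown in the examples above.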

### Instagram Post details

Class `InstagramPost` scrapes information about a particular Instagram post. It takes the post ID as its parameter. You can get the post ID from the URL of an Instagram post, from the `InstagramUser.posts` property, or from `InstagramHashTag.top_posts`.

#### Properties
```python
>>> from instagramy import InstagramPost

- author
- caption
- description
- number_of_comments
- number_of_likes
- post_detail
- uploaded_date
>>> session_id = "38566737751%3Ah7JpgePGAoLxJe%334"

<img src="https://raw.githubusercontent.com/yogeshwaran01/instagramy/master/samples/post.png" width=100% height=100%>
>>> post = InstagramPost('CLGkNCoJkcM', sessionid=session_id)

## Caching
>>> post.author
'ipadpograffiti'

From version 4.1, the package caches results to avoid some errors. Because of caching, it may return stale data for users or hashtags.
To avoid this, pass the argument `from_cache=False` or delete the hidden folder `.instagramy_cache`
>>> post.number_of_likes
1439

```python
>>> post.post_data # More data about post as dict

```
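
Since the post ID can be taken from a post's URL, a small standard-library helper can do the extraction. This is a rough sketch under one assumption: that post URLs have the shape `https://www.instagram.com/p/<post id>/`. The function name `post_id_from_url` is hypothetical and not part of instagramy.

```python
from typing import Optional
from urllib.parse import urlparse


def post_id_from_url(url: str) -> Optional[str]:
    """Return the ID segment of a post URL, or None if the URL has no such segment.

    Assumes URLs shaped like https://www.instagram.com/p/<post id>/.
    """
    # Split the path into non-empty segments, e.g. ["p", "CLGkNCoJkcM"].
    parts = [part for part in urlparse(url).path.split("/") if part]
    if len(parts) >= 2 and parts[0] == "p":
        return parts[1]
    return None
```

The result can be passed as the first argument of `InstagramPost`, as in the example above.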

### Plugins

Instagramy includes some plugins for common tasks

from instagramy import InstagramUser
#### Plugins for Data Analyzing

user = InstagramUser('github', from_cache=False)
- analyze_users_popularity
- analyze_hashtags
- analyze_user_recent_posts

```python
>>> import pandas as pd
>>> from instagramy.plugins.analysis import analyze_users_popularity

>>> session_id = "38566737751%3Ah7JpgePGAoLxJe%334"

>>> teams = ["chennaiipl", "mumbaiindians",
...          "royalchallengersbangalore", "kkriders",
...          "delhicapitals", "sunrisershyd",
...          "kxipofficial"]
>>> data = analyze_users_popularity(teams, session_id)
>>> pd.DataFrame(data)

Usernames Followers Following Posts
0 chennaiipl 6189292 194 5646
1 mumbaiindians 6244961 124 12117
2 royalchallengersbangalore 5430018 59 8252
3 kkriders 2204739 68 7991
4 delhicapitals 2097515 75 9522
5 sunrisershyd 2053824 70 6227
6 kxipofficial 1884241 67 7496
```
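
The result can also be processed without pandas. The sketch below assumes that `analyze_users_popularity` returns a dict of equal-length columns keyed by the headers in the table above (`Usernames`, `Followers`, `Posts`). The source does not confirm that shape, so treat it as an assumption. The helper `followers_per_post` is hypothetical, and the sample values are copied from three rows of the table.

```python
# Assumed shape: a dict of equal-length columns, mirroring the table above.
data = {
    "Usernames": ["chennaiipl", "mumbaiindians", "kkriders"],
    "Followers": [6189292, 6244961, 2204739],
    "Following": [194, 124, 68],
    "Posts": [5646, 12117, 7991],
}


def followers_per_post(columns: dict) -> list:
    """Return (username, followers / posts) pairs, highest ratio first."""
    rows = zip(columns["Usernames"], columns["Followers"], columns["Posts"])
    # Skip accounts with zero posts to avoid dividing by zero.
    ratios = [(name, followers / posts) for name, followers, posts in rows if posts]
    return sorted(ratios, key=lambda pair: pair[1], reverse=True)


ranking = followers_per_post(data)
```

Here `ranking` lists the accounts ordered by how many followers they have per post.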

## ⚠️ Note
#### Plugins for Downloading Posts

- download_hashtags_posts
- download_post
- download_profile_pic

- Don't send many requests in quick succession, or Instagram redirects to the login page. If that happens, reboot your PC, change your IP address, or try again after some time.
- This package does not work on a remote PC or in any online Python interpreter.
- This package does not scrape all the posts from an account; the limit is 12 posts (for non-private accounts).
- This package does not scrape all the posts of a given hashtag; it scrapes only the top 60 - 70 posts.
```python
>>> import os

>>> from instagramy.plugins.download import *

>>> session_id = os.environ.get('SESSION_ID')

>>> download_profile_pic(username='google', sessionid=session_id, filepath='google.png')

### Sample-Scripts
>>> download_post(id="ipadpograffiti", sessionid=session_id, filepath='post.mp4')

Some sample scripts based on this package
>>> download_hashtags_posts(tag="tamil", session_id=session_id, count=2)
```

### Use Without Login

- [👦 Download Instagram DP](https://github.com/yogeshwaran01/Python-Scripts/blob/master/Scripts/instadp.py)
You can use this package without logging in. A session ID is not required, but the package may raise an error after four to five requests.

- [📊 Analysis Instagram Accounts with Matplotlib](https://github.com/yogeshwaran01/Python-Scripts/blob/master/Scripts/instalysis.py)
```python
>>> from instagramy import *

- [#️⃣ Bulk Instagram Hashtag Posts Download](https://github.com/yogeshwaran01/Python-Scripts/blob/master/Scripts/instagram_hastags_post.py)
>>> user = InstagramUser('google')
>>> user.fullname
'Google'
>>> tag = InstagramHashTag('python')
>>> tag.tag_data
```

## ✏️ Important Notes

- You can use this package without a session ID (login), but it may raise `RedirectionError` after four to five requests.
- The `Viewer` class provides data about the currently logged-in user.
- Don't provide a wrong session ID.
- `InstagramUser.user_data`, `InstagramPost.post_data` and `InstagramHashTag.tag_data` are Python `dict`s that hold more data than the attributes listed as `Properties`.
- This package does not work on a remote PC or in any online Python interpreter.
- This package does not scrape all the posts from an account; the limit is 12 posts (for non-private accounts).
- This package does not scrape all the posts of a given hashtag; it scrapes only the top 60 - 70 posts.
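
Because requests without a session ID may fail with `RedirectionError` after a few calls, one option is to wait and retry with growing delays. The helper below is a generic sketch and not part of instagramy: `call_with_backoff`, its parameters and the default delays are all assumptions chosen for illustration.

```python
import time


def call_with_backoff(fetch, retry_on, attempts=3, base_delay=30.0, sleep=time.sleep):
    """Call fetch(); on `retry_on`, wait base_delay * 2**n seconds and try again.

    Hypothetical helper: `fetch` is any zero-argument callable and `retry_on`
    is the exception class (or tuple of classes) that should trigger a retry.
    """
    for attempt in range(attempts):
        try:
            return fetch()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller see the error
            sleep(base_delay * 2 ** attempt)
```

A possible use would be `call_with_backoff(lambda: InstagramUser('google'), RedirectionError)`, with `RedirectionError` imported from `instagramy.core.exceptions`, the module that the diff's `from .core.exceptions import RedirectionError` line refers to. Retrying does not guarantee success, so passing a session ID remains the more reliable route.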

<h3 align="center"> Made with Python ❤️ </h3>
41 changes: 24 additions & 17 deletions instagramy/InstagramHashTag.py
@@ -16,12 +16,15 @@
>>> tag.top_posts
"""
from datetime import datetime
from collections import namedtuple

from .core.parser import ParseHashTag
from .core.cache import Cache
from .core.requests import get
from .core.parser import Parser
from .core.parser import Viewer
from .core.exceptions import HashTagNotFound
from .core.exceptions import RedirectionError
from .core.exceptions import HTTPError
from .core.requests import get


class InstagramHashTag:
@@ -33,27 +36,28 @@ class InstagramHashTag:
>>> instagram_user.posts_display_urls
"""

def __init__(self, tag: str, from_cache=True):
def __init__(self, tag: str, sessionid=None):
self.url = f"https://www.instagram.com/explore/tags/{tag}/"
if from_cache:
cache = Cache("tag")
if cache.is_exists(tag):
self.tag_data = cache.read_cache(tag)
else:
self.tag_data = self.get_json()
cache.make_cache(tag, self.tag_data)
self.sessionid = sessionid
data = self.get_json()
try:
self.tag_data = data["entry_data"]["TagPage"][0]["graphql"]["hashtag"]
except KeyError:
raise RedirectionError
if sessionid:
self.viewer = Viewer(data=data["config"]["viewer"])
else:
self.tag_data = self.get_json()
self.viewer = None

def get_json(self) -> dict:
"""
Return a dict of Hashtag information
"""
try:
html = get(self.url)
html = get(self.url, sessionid=self.sessionid)
except HTTPError:
raise HashTagNotFound(self.url.split("/")[-2])
parser = ParseHashTag()
parser = Parser()
parser.feed(html)
return parser.Data

@@ -95,9 +99,11 @@ def top_posts(self) -> list:
except (KeyError, TypeError):
data["is_video"] = None
try:
data["timestamp"] = node["node"]["taken_at_timestamp"]
data["upload_time"] = datetime.fromtimestamp(
node["node"]["taken_at_timestamp"]
)
except (KeyError, TypeError):
data["timestamp"] = None
data["upload_time"] = None
try:
data["caption"] = node["node"]["accessibility_caption"]
except (KeyError, TypeError):
@@ -116,7 +122,8 @@ def top_posts(self) -> list:
data["display_url"] = node["node"]["display_url"]
except (KeyError, TypeError):
data["display_url"] = None
post_lists.append(data)
nt = namedtuple("Post", data.keys())(*data.values())
post_lists.append(nt)
return post_lists

@property