
Client IP Address to Closest AWS Region

Question

I would like to upload some data to AWS from a client device, but I'd like to upload to the closest AWS Region's S3 Bucket.

Similarly, I'd like to be able to download from the nearest region.

Of course, I'd set up a bucket in each region.

Is there a system that I can use that maybe takes the IP Address of the client, then works out whether it's us-west-1, eu-west-1, eu-central-1, ap-northeast-1 etc?

The crux of the problem is this: the data I'm uploading is useful to only one person, and it needs to reach that person as quickly as possible.

So if I'm in England and my intended recipient is currently in Japan (they could be on the move), uploading to the London AWS region would give the recipient a higher ping time than uploading to a region closer to Japan.

Route 53 latency-based routing could help you determine the closest region. However, the bucket name will be different in each region, so I'm not sure how you would use this directly with S3.

I think the best option is to place a CloudFront distribution in front of a single S3 bucket. Then your users can automatically upload to the closest CloudFront edge location. https://aws.amazon.com/blogs/aws/amazon-cloudfront-content-uploads-post-put-other-methods/

Find client's location from IP

Use geoip

pip install python-geoip
pip install python-geoip-geolite2

Then your code will look something like this.

from geoip import geolite2

# lookup() returns None when the address is not in the database
match = geolite2.lookup('8.8.8.8')
if match is not None:
    print(match.location)

This produces (37.386, -122.0838).

Find locations of all AWS centers

The information is available from: http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/LocationsOfEdgeServers.html. You then need to find the geolocation of each of them, which can be done with geopy.

pip install geopy

Then

from geopy.geocoders import Nominatim

# Recent geopy versions require a user_agent; "my-app" is a placeholder
geolocator = Nominatim(user_agent="my-app")
location = geolocator.geocode("Singapore")
print(location.latitude, location.longitude)

Which gives

1.2904527 103.852038

You need to do this for all your locations and save the data somewhere, possibly in an RDBMS. If you go that route, consider using Django, which has excellent support for geolocation searching through GeoDjango.
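As a rough illustration of saving the geocoded results, here is a minimal sketch that uses Python's built-in sqlite3 module rather than a full geospatial database. The table name `locations` and the helpers `save_locations` and `load_locations` are hypothetical names chosen for this example; the Singapore coordinates are the ones printed above.

```python
import sqlite3

def save_locations(conn, locations):
    """Store (name, latitude, longitude) rows, replacing existing names."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS locations ("
        "name TEXT PRIMARY KEY, latitude REAL, longitude REAL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO locations VALUES (?, ?, ?)", locations
    )
    conn.commit()

def load_locations(conn):
    """Return {name: (latitude, longitude)} for every saved location."""
    rows = conn.execute("SELECT name, latitude, longitude FROM locations")
    return {name: (lat, lon) for name, lat, lon in rows}

# An in-memory database keeps the sketch self-contained;
# pass a file path instead to persist the data.
conn = sqlite3.connect(":memory:")
save_locations(conn, [("Singapore", 1.2904527, 103.852038)])
print(load_locations(conn))
```

Plain SQLite has no geospatial functions, so this only covers storage; the distance comparison would still happen in Python.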

Finally finding the distance

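Having found the client location (call it l1, the (latitude, longitude) tuple from match.location) and a data center location (call it l2, a geopy result such as location above), it's time to find the distance between them:

from geopy.distance import great_circle

# great_circle accepts (lat, lon) tuples as well as geopy Point objects
distance = great_circle(l1, l2.point)
print(distance.km)

And there you have the distance.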

Finding the closest distance

You could loop through all your saved locations and find the closest one, or, if you saved your data in an RDBMS that supports geospatial data (PostGIS immediately comes to mind), you can use the ST_Distance function to do the distance comparison quickly and effectively with very little code. As mentioned earlier, Django has excellent support for geospatial queries.

If you were to use PostGIS/Django, the loop involving great_circle would be replaced by a call to ST_Distance.
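The looping approach can be sketched with the standard library alone. This example swaps geopy's great_circle for a hand-written haversine formula (also a great-circle distance on a spherical Earth) so that it runs without extra packages. The `REGION_COORDS` table and the function names are assumptions for illustration: the coordinates are approximate city-level positions, not official AWS data center locations.

```python
import math

# Approximate city-level coordinates, assumed for illustration only.
REGION_COORDS = {
    "us-west-1": (37.35, -121.96),      # N. California
    "eu-west-1": (53.35, -6.26),        # Ireland
    "eu-central-1": (50.11, 8.68),      # Frankfurt
    "ap-northeast-1": (35.68, 139.69),  # Tokyo
}

EARTH_RADIUS_KM = 6371.0

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

def closest_region(client, regions=REGION_COORDS):
    """Return the region name whose coordinates are nearest to the client."""
    return min(regions, key=lambda name: haversine_km(client, regions[name]))

# The location geoip returned for 8.8.8.8 earlier in this answer
print(closest_region((37.386, -122.0838)))  # us-west-1
```

Keep in mind that geographic distance is only a proxy for network latency, so the nearest region on the map is not guaranteed to be the fastest one.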

You could use the Transfer Acceleration feature that S3 offers (you can enable it in the bucket properties using the AWS console).

Documentation: https://docs.aws.amazon.com/AmazonS3/latest/dev/transfer-acceleration.html

You might want to use Transfer Acceleration on a bucket for various reasons, including the following:

  • You have customers that upload to a centralized bucket from all over the world.
  • You transfer gigabytes to terabytes of data on a regular basis across continents.
  • You underutilize the available bandwidth over the Internet when uploading to Amazon S3.
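Once acceleration is enabled, clients send requests to the bucket's accelerate endpoint, which has the form bucketname.s3-accelerate.amazonaws.com. The sketch below builds that URL; `accelerate_endpoint` and the bucket name `my-upload-bucket` are hypothetical, and the name check is a simplified version of the rule that accelerated buckets must have DNS-compliant names without periods.

```python
import re

# Simplified check: 3-63 characters, lowercase letters, digits, and hyphens,
# starting and ending with a letter or digit. Periods are rejected because
# Transfer Acceleration does not support bucket names that contain them.
_BUCKET_NAME = re.compile(r"[a-z0-9][a-z0-9-]{1,61}[a-z0-9]")

def accelerate_endpoint(bucket):
    """Return the Transfer Acceleration endpoint URL for a bucket."""
    if not _BUCKET_NAME.fullmatch(bucket):
        raise ValueError(
            f"bucket name not usable with Transfer Acceleration: {bucket!r}"
        )
    return f"https://{bucket}.s3-accelerate.amazonaws.com"

print(accelerate_endpoint("my-upload-bucket"))
```

AWS SDKs can usually be configured to use this endpoint for you, so building the URL by hand is mainly useful for understanding what changes on the wire.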

With boto3, you can read the region_name from the session.Session object:

import boto3

# region_name comes from your AWS config or environment; it is None if unset
my_session = boto3.session.Session()
my_region = my_session.region_name

The region_name is defined as session.get_config_variable('region')

