LinuxQuestions.org
Old 06-17-2017, 03:56 AM   #1
Ulysses_
Senior Member
 
Registered: Jul 2009
Posts: 1,023

Rep: Reputation: 45
Best way to store absolutely everything I ever see online, from all computers and VMs, in a searchable format


Too often I bookmark a nice site, and months later I forget the bookmark title and have to search for the site online, which does not always succeed. Or it takes too much effort and I give up.

What if everything ever seen online, from all VMs and all my computers (CRUCIAL), were stored somewhere safely and made searchable offline?

What's the best way to do it? Must include HTTPS sites too. Could devote a computer to it, for maximum security.
 
Old 06-17-2017, 05:05 AM   #2
business_kid
LQ Guru
 
Registered: Jan 2006
Location: Ireland
Distribution: Slackware & Android
Posts: 7,916

Rep: Reputation: 775
As for backing up your PCs, get a big USB drive, and back them up regularly with rsync.

So you want to store a large amount of reference material. To quote the Kerryman, when asked for directions:
"If I was you, I wouldn't start from here at all!"

Most of what we meet online is forgettable. You will certainly drown in data if you store everything. Social media data ages like fresh fruit. For years, I gave a number of public discourses regularly with a scientific component. I grew a ~/Scientific_talks directory, added categorized sub-directories (it currently has 20). I have archived material some of which has been removed from websites over the years. But the directory is only 1.4G because I keep it clean. My ~/pdf, ~/historians, & ~/historical_books are similar but don't have 3G between them. There are a few rules.
* Seek text formats and eschew scans of text in picture format.
* Rename each file to have a meaningful title. For scientific papers, I use 'Author_Subject' and strip out titles like WnCT000056429.pdf, and meaningless words (definite/indefinite articles, prepositions, sensationalist blurb, etc.).
* Store in descriptively named, subject-based directories.
* Clean out at least once per year.
* Keep it up to date on subjects of interest.

This way, a file manager is often enough to search for something; <find -name> can often locate a particular file, and grep finds a word.
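For anyone who would rather script those two searches, here is a minimal Python sketch. It assumes the archive is a tree of readable text files under one root directory; the function names find_by_name and grep_word are made up for illustration, and the behaviour only roughly mirrors find -name and grep -rl.

```python
import fnmatch
import os


def find_by_name(root, pattern):
    """Return paths under root whose file name matches a shell-style pattern."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # fnmatchcase is case-sensitive on every platform, like find -name.
            if fnmatch.fnmatchcase(name, pattern):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


def grep_word(root, word):
    """Return paths of files under root whose text contains word."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    if word in handle.read():
                        hits.append(path)
            except OSError:
                continue  # unreadable file: skip it
    return sorted(hits)
```

Something like find_by_name(os.path.expanduser("~/Scientific_talks"), "*Darwin*") would list matching files; scanned PDFs and other binary formats would need a text-extraction step first, which is one more reason to prefer text formats.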

I also have a 'Sci_vs_Sci' directory (from Mad Magazine's Spy versus Spy cartoon strip), where one scientist attacks another in print and the other responds. That's common around evolution. It can be illuminating. A scientist makes 5 points in his paper, and all seem good to me; his attacker accepts points 1 & 2, rubbishes 3, and argues against 4 & 5; the rebuttal defends 4 & 5. Therefore, 1 & 2 seem proved, 3 is disproved, and you can view the merits of the case for 4 & 5. Sometimes the insights in these exchanges reveal as much as the paper.
 
Old 06-17-2017, 06:07 AM   #3
Jjanel
Member
 
Registered: Jun 2016
Distribution: any&all, in VBox; Ol'UnixCLI; NO GUI resources
Posts: 885
Blog Entries: 10

Rep: Reputation: 291
Web-search: cache all|every site visited|viewed
returned 1st: https://www.wired.com/2014/11/fetching-io

Sounds like a reasonable & common desire! Let us know what you choose!
 
Old 06-17-2017, 07:40 AM   #4
Habitual
LQ Addict
 
Registered: Jan 2011
Posts: 8,253
Blog Entries: 11

Rep: Reputation: 2289
I've been known to tell my Firefox to never forget history,
and to search using history and bookmarks and cache.
I open a new tab and start typing and the site I'm interested in shows up.

Otherwise, I'd run http through a proxy and send proxy logs to Kibana,
but that's just me thinking w/out coffee!
 
Old 06-17-2017, 10:16 AM   #5
Ulysses_
Senior Member
 
Registered: Jul 2009
Posts: 1,023

Original Poster
Rep: Reputation: 45
Fetching.io has the spirit alright.

History is not good enough, because a phrase that stays in your mind to study later may not appear in what the history holds (just the titles). You may not realize the importance of a site soon enough to bookmark it or save it.

With fetching.io you'd need to search dozens of your VMs and computers one by one to find the machine you saw a site on. Can't we do better than that? Must do HTTPS sites too.
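One way to picture "better than that": every machine reports the text of each page it loads to a single index on a box at home, and each entry remembers which machine saw it. A rough Python sketch of that idea follows. The PageIndex class, the machine names and the URLs are invented for illustration, and how the page text gets captured in the first place (browser extension, local proxy, something else) is left open, so this is not how fetching.io works.

```python
import re
from collections import defaultdict


class PageIndex:
    """Tiny in-memory inverted index over page text from many machines."""

    def __init__(self):
        self.pages = []                    # page id -> (machine, url, text)
        self.postings = defaultdict(set)   # word -> set of page ids

    def add(self, machine, url, text):
        """Store one page and index every word in it."""
        page_id = len(self.pages)
        self.pages.append((machine, url, text))
        for word in re.findall(r"\w+", text.lower()):
            self.postings[word].add(page_id)
        return page_id

    def search(self, phrase):
        """Return (machine, url) for each page containing the phrase, ignoring case."""
        words = re.findall(r"\w+", phrase.lower())
        if not words:
            return []
        # Narrow to pages containing every word, then confirm the phrase itself.
        candidates = set.intersection(*(self.postings.get(w, set()) for w in words))
        results = []
        for page_id in sorted(candidates):
            machine, url, text = self.pages[page_id]
            if phrase.lower() in text.lower():
                results.append((machine, url))
        return results
```

After index.add("vm-7", "https://example.org/b", page_text) from each machine, a single index.search("some phrase") answers both what the page was and where it was seen. A real setup would keep the index on disk and would have to solve capture for HTTPS pages separately.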

Last edited by Ulysses_; 06-17-2017 at 10:39 AM.
 
Old 06-17-2017, 10:21 AM   #6
Ulysses_
Senior Member
 
Registered: Jul 2009
Posts: 1,023

Original Poster
Rep: Reputation: 45
One caveat: we don't want any site we visit to be able to learn our entire history or bookmarks, as used to happen with some versions of Firefox.
 
Old 06-17-2017, 10:25 AM   #7
Ulysses_
Senior Member
 
Registered: Jul 2009
Posts: 1,023

Original Poster
Rep: Reputation: 45
And another thing: even web-based chat like Disqus could be stored, and that would be appreciated too.
 
Old 06-18-2017, 09:24 PM   #8
justmy2cents
Member
 
Registered: May 2017
Location: U.S.
Distribution: Un*x
Posts: 131

Rep: Reputation: Disabled
You can send traffic intercepted by a device like a Pwn Plug R2 over to a remote server via ssh, and process it there with the ngrep packet search tool, the tshark and wireshark traffic analysis tools, the tcpflow data stream capture tool, the dsniff suite's passive monitoring tools, and tcpxtract for capturing files within internet traffic. To extract meaningful information from all the content, you can use the traffic analysis tools to scan for keywords or look for patterns or structures in the data. You can also count the repetition of words in a document to get a sense of what the text is about. Also check out Bro IDS.
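The word-counting idea can be sketched in a few lines of Python. This assumes the page text has already been extracted from the captured traffic; top_words and the short STOP_WORDS list are illustrative names and values, not part of any of the tools above.

```python
import re
from collections import Counter

# Assumed, illustrative stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it", "for", "on"}


def top_words(text, n=5):
    """Return the n most frequent non-stop-words in text as (word, count) pairs."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return counts.most_common(n)
```

The most frequent words left after filtering give a crude hint of what a document is about, which could serve as tags for later searching.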

Last edited by justmy2cents; 06-18-2017 at 09:31 PM.
 
Old 06-19-2017, 08:04 AM   #9
Ulysses_
Senior Member
 
Registered: Jul 2009
Posts: 1,023

Original Poster
Rep: Reputation: 45
Can someone translate into English please.
 
Old 06-19-2017, 01:25 PM   #10
justmy2cents
Member
 
Registered: May 2017
Location: U.S.
Distribution: Un*x
Posts: 131

Rep: Reputation: Disabled
Quote:
Originally Posted by Ulysses_
Can someone translate into English please.
I saw this on Ars Technica (it's similar to what you want, I think).
 
Old 06-19-2017, 02:33 PM   #11
Ulysses_
Senior Member
 
Registered: Jul 2009
Posts: 1,023

Original Poster
Rep: Reputation: 45
It says "over to a remote server". Can't trust any remote server. Only my own local hardware at home. Can't trust the recommended device either. Searching without indexing maybe only as a last resort. Raw sniffed data would probably need a lot of coding to reformat into something easy to read. Better not re-invent the wheel, someone has probably already done it. This is beginning to sound like spy software. Where is an ex cop when you need them.
 
Old 06-19-2017, 02:44 PM   #12
justmy2cents
Member
 
Registered: May 2017
Location: U.S.
Distribution: Un*x
Posts: 131

Rep: Reputation: Disabled
Quote:
Originally Posted by Ulysses_
It says "over to a remote server". Can't trust any remote server. Only my own local hardware at home. Can't trust the recommended device either. Searching without indexing maybe only as a last resort. Raw sniffed data would probably need a lot of coding to reformat into something easy to read. Better not re-invent the wheel, someone has probably already done it. This is beginning to sound like spy software. Where is an ex cop when you need them.
I thought I posted the link, my bad: https://arstechnica.com/security/201...ernet-traffic/ But the remote server can be one you own. Nevertheless, some of the tools would likely help you.

Last edited by justmy2cents; 06-19-2017 at 03:39 PM.
 
  


All times are GMT -5. The time now is 07:18 AM.
