Delivery-Date: Fri, 13 Feb 2015 19:53:03 -0500
Return-Path: <tor-talk-bounces@lists.torproject.org>
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on moria.seul.org
X-Spam-Level: 
X-Spam-Status: No, score=-4.7 required=5.0 tests=BAYES_00,DKIM_SIGNED,
	RCVD_IN_DNSWL_MED,RP_MATCHES_RCVD,T_DKIM_INVALID,URIBL_BLOCKED autolearn=ham
	version=3.3.1
X-Original-To: archiver@seul.org
Delivered-To: archiver@seul.org
Received: from eugeni.torproject.org (eugeni.torproject.org [38.229.72.13])
	(using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by khazad-dum.seul.org (Postfix) with ESMTPS id A00B11E0F88
	for <archiver@seul.org>; Fri, 13 Feb 2015 19:53:01 -0500 (EST)
Received: from eugeni.torproject.org (localhost [127.0.0.1])
	by eugeni.torproject.org (Postfix) with ESMTP id 4A9BD3347B;
	Sat, 14 Feb 2015 00:52:58 +0000 (UTC)
Received: from localhost (localhost [127.0.0.1])
 by eugeni.torproject.org (Postfix) with ESMTP id 78C833345D
 for <tor-talk@lists.torproject.org>; Sat, 14 Feb 2015 00:52:54 +0000 (UTC)
X-Virus-Scanned: Debian amavisd-new at 
Received: from eugeni.torproject.org ([127.0.0.1])
 by localhost (eugeni.torproject.org [127.0.0.1]) (amavisd-new, port 10024)
 with ESMTP id 6Z1fwlXSyTlx for <tor-talk@lists.torproject.org>;
 Sat, 14 Feb 2015 00:52:54 +0000 (UTC)
Received: from mail-ig0-x22e.google.com (mail-ig0-x22e.google.com
 [IPv6:2607:f8b0:4001:c05::22e])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (not verified))
 by eugeni.torproject.org (Postfix) with ESMTPS id 4B01432924
 for <tor-talk@lists.torproject.org>; Sat, 14 Feb 2015 00:52:54 +0000 (UTC)
Received: by mail-ig0-f174.google.com with SMTP id b16so14107340igk.1
 for <tor-talk@lists.torproject.org>; Fri, 13 Feb 2015 16:52:52 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=virgil.gr; s=dkim;
 h=mime-version:in-reply-to:references:from:date:message-id:subject:to
 :content-type; bh=eYMo52bDjmMtKOyL7m96lT0WEy4EcOCAoX/rFfUnFMU=;
 b=AH24OuJXJRyiWOyO+nQMEASkKF0EHh1lYqii21ckyWxLtHs8Bu8/yd9lsb2Hx2dbVM
 0kbbDJLjkC5JBhcCaVZ5egmEIiCUHIGb50b84uRPcD1Wn/+1OePwDgwoerteBV5zNgMy
 Mi1w9B1PsC/QMDI91N+gosRbq7MEbHMAxVDFk=
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20130820;
 h=x-gm-message-state:mime-version:in-reply-to:references:from:date
 :message-id:subject:to:content-type;
 bh=eYMo52bDjmMtKOyL7m96lT0WEy4EcOCAoX/rFfUnFMU=;
 b=Vp1yRpLQQ1vJoya0fwOsEhjAM0D03jZAGigMvFwF9Y7D2laWGCtYARdYQvpb2q4pmy
 a+g482SMqyvFFM4J0IeZgXVN84kHOMiGS4M0QwMetx41DOdCdRFtlbqgigrYi7pIlIhq
 pZAIJTn7pXWmHWPHRK3hJArag4xQPimInXwc77YXphfg2acBVnq3ZIYZDbgZ+f7S5HVK
 lY+9nE49m/CbND+jcUdFN4raX0DK/qrITD/gSrRNYERmUggbny1u8Lgo/pATVgTdJJpG
 aLb5bxLdLfZadyTsHcKpYW+9Ju9EIltFt8x+puoWRGApCj7NGKtdBvKGLJdIFtTG5BPW
 QE+g==
X-Gm-Message-State: ALoCoQnY0behEtmUp5qnoPa9ApsaF8o4Xl7r+ILaybGU6Er/Rt9EZI+X7PQq5wN1n/eOqxetSKVN
X-Received: by 10.107.131.224 with SMTP id n93mr15699809ioi.66.1423875172011; 
 Fri, 13 Feb 2015 16:52:52 -0800 (PST)
MIME-Version: 1.0
Received: by 10.50.111.43 with HTTP; Fri, 13 Feb 2015 16:52:31 -0800 (PST)
In-Reply-To: <20150213233054.AEB08C040C@smtp.hushmail.com>
References: <CADop2NFwg+mViiRWWZWYQo4S8m+d87U10LrvAsQZmuGHTBOGjQ@mail.gmail.com>
 <87r3ttpz37.fsf@riseup.net>
 <CADop2NEFxHz1U=5_r4Zffj9Z0oa+EW1QOG61cQoMJMR8NgqCjA@mail.gmail.com>
 <20150213233054.AEB08C040C@smtp.hushmail.com>
From: Virgil Griffith <i@virgil.gr>
Date: Fri, 13 Feb 2015 16:52:31 -0800
Message-ID: <CADop2NEhTvY4CGwKHU+j6xLTkpz5As0WqOjKZyLo-9MJ901UPw@mail.gmail.com>
To: "tor-talk@lists.torproject.org" <tor-talk@lists.torproject.org>
X-Content-Filtered-By: Mailman/MimeDel 2.1.15
Subject: Re: [tor-talk] Funded search engine for onionspace?
X-BeenThere: tor-talk@lists.torproject.org
X-Mailman-Version: 2.1.15
Precedence: list
Reply-To: tor-talk@lists.torproject.org
List-Id: "all discussion about theory, design,
 and development of Onion Routing" <tor-talk.lists.torproject.org>
List-Unsubscribe: <https://lists.torproject.org/cgi-bin/mailman/options/tor-talk>, 
 <mailto:tor-talk-request@lists.torproject.org?subject=unsubscribe>
List-Archive: <http://lists.torproject.org/pipermail/tor-talk/>
List-Post: <mailto:tor-talk@lists.torproject.org>
List-Help: <mailto:tor-talk-request@lists.torproject.org?subject=help>
List-Subscribe: <https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-talk>, 
 <mailto:tor-talk-request@lists.torproject.org?subject=subscribe>
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Errors-To: tor-talk-bounces@lists.torproject.org
Sender: "tor-talk" <tor-talk-bounces@lists.torproject.org>

Leeroy, to avoid being indexed by Googlebot et al., place an appropriate
/robots.txt at the root of your onionsite.  It's described in the FAQ.

http://www.onion.city/faq.html
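
For illustration only (this is not copied from the FAQ), here is a minimal
sketch of a /robots.txt that turns away Googlebot and nobody else, checked
with Python's standard urllib.robotparser.  The hostname
someonion.onion.city and the crawler name ExampleBot are placeholders, and
the helper can_crawl is made up for this example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical /robots.txt body: refuse Googlebot, leave other crawlers alone.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /
"""


def can_crawl(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "http://someonion.onion.city/index.html"  # placeholder hostname
    print("Googlebot allowed: ", can_crawl("Googlebot", page))
    print("ExampleBot allowed:", can_crawl("ExampleBot", page))
```

To refuse every well-behaved crawler rather than just Googlebot, the usual
form is "User-agent: *" followed by "Disallow: /".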

As a historical note, the reason Aaron and I chose Tor2web's URL design was
so search engines would automatically see any /robots.txt an onionsite
specifies.
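
A rough sketch of what that URL design implies, assuming the gateway simply
swaps the trailing ".onion" for its own domain (as in someonion.onion.city):
each onionsite gets its own hostname, so a crawler that follows the
robots.txt convention asks for /robots.txt on that hostname.  The helper
names below are invented for illustration and are not part of Tor2web.

```python
def tor2web_host(onion_host: str, gateway_domain: str = "onion.city") -> str:
    """Map 'abc.onion' to a gateway hostname such as 'abc.onion.city'."""
    host = onion_host.lower().rstrip(".")
    if not host.endswith(".onion"):
        raise ValueError(f"not an onion hostname: {onion_host!r}")
    # Drop the ".onion" suffix and append the gateway's domain instead.
    label = host[: -len(".onion")]
    return f"{label}.{gateway_domain}"


def robots_url(onion_host: str, gateway_domain: str = "onion.city") -> str:
    """URL where a crawler would look for the site's robots.txt via the gateway."""
    return f"http://{tor2web_host(onion_host, gateway_domain)}/robots.txt"


if __name__ == "__main__":
    print(robots_url("xlmvhk3rpdux26dz.onion"))
    print(robots_url("xlmvhk3rpdux26dz.onion", "tor2web.org"))
```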


-V

On Fri, Feb 13, 2015 at 3:30 PM, l.m <ter.one.leeboi@hush.com> wrote:

>
> >Alas no.  I'm aware this is suboptimal.  I see GOOG search engine as a
> >temporary-ladder just to get the ball rolling.  I am open to using any
> >other index.  For what it's worth I'm very pleased with GOOG's
> >performance---right now it's searching an index of 650k onion pages and
> >the number grows every day.
>
> If you instead use a google search appliance couldn't you use google
> engine for indexing without having to use google itself? Wouldn't that
> also avoid the problem of google queries being associated with the
> client making the request?
>
> >Although we technically could read provided passwords, we don't keep logs
> >of passed traffic.  However, I understand that many users don't understand
> >the tor2web threat model.  But this is the same as all Tor2web nodes, yes?
> >This is not at all unique to OnionCity.  As far as I know all Tor2web nodes
> >allow form submissions.
>
> What is unique to onion.city is that access to someonion.onion.city
> occurs using http and doesn't redirect to the .onion if Tor is in use.
> That the tor2web mirror might snoop is implicit--that the exit (if
> using tor) might also snoop is more of a concern.
>
> >You mentioned it'd be better to have it randomly pick among the available
> >Tor2web nodes instead of everything going through OnionCity.  This breaks
> >the GOOG search engine which only wants to return "canonical" URLs.  We
> >could talk about making OnionCity a DNS round-robin akin to how Tor2web.org
> >currently works, but then I'm just replicating Tor2web.
>
> The ability of tor2web to provide mirrors should be optional. If you
> only know one mirror and that mirror cannot service the request then
> how are you going to get any of the other mirrors? Google engine can
> return related addresses in an order based on the success of loading
> the mirror itself. If onion.city always works it will tend to precede
> tor2web.org. If onion.city goes down (having search front-end separate
> from tor2web mirror) the search engine can reorder the result to
> improve the success of the first click.
>
> >Right now I aggregate existing lists of onion sites and put them into the
> >site map.
> >* https://ahmia.fi/onions/
> >* http://skunksworkedp2cg.onion.city/sites.txt
> >* http://xlmvhk3rpdux26dz.onion.city/
> >* http://kkkkkku5juzqh33a.onion.city/
>
> If google is itself handling the indexing won't that cause a problem
> for sites in those lists, which are normally okay with being indexed,
> just not by googlebot? I for one couldn't care less about being
> indexed by ahmia.fi but it'll be a cold day in hell before I let
> googlebot. Precisely because of how easy it is to link the search to
> the requester.
> --leeroy
> --
> tor-talk mailing list - tor-talk@lists.torproject.org
> To unsubscribe or change other settings go to
> https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-talk
>

