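The following is a draft translation of the original text below as of Thu Oct 5 17:04:32 2006 UTC. Please refrain from copying or reposting it. The intellectual property rights to the original belong to the University of California.
Original: A new major SETI project based on Project Serendip data and 100,000 personal computers
Last updated: 20:39:33, November 30, 2006 (JST)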

A new major SETI project based on Project Serendip data and 100,000 personal computers


Translation in preparation.

The following 1997 paper was presented at the conference below.

"Astronomical and Biochemical Origins and the Search for Life in the Universe", Proc. of the Fifth Intl. Conf. on Bioastronomy = IAU Colloq. No. 161, eds. C.B. Cosmovici, S. Bowyer, and D. Werthimer (Publisher: Editrice Compositori, Bologna, Italy)


A new major SETI project based on Project Serendip data and 100,000 personal computers

W. T. Sullivan, III (U. Washington), D. Werthimer, S. Bowyer, J. Cobb (U. California, Berkeley), D. Gedye, D. Anderson (Big Science, Inc.)

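Abstract

The authors are developing an innovative SETI project, tentatively named seti@home, that performs massively parallel computation on desktop computers scattered around the world. A distinguishing feature of the project is that members of the general public take part in a genuine scientific project. Each participant downloads a screensaver program which, like any ordinary screensaver, displays attractive graphics when the participant's computer is idle. In addition, however, it uses that computer to carry out a sophisticated analysis of SETI data. The data are read out from the Project Serendip IV receiver, as part of the SETI search operating on the 305-m Arecibo radio telescope. We continuously record on tape a signal of 2 MHz bandwidth centered on the 21 cm HI line. After preliminary screening at the server, the taped data are divided into small chunks (0.25 MB each, containing 50 seconds of signal with a 20 kHz bandwidth) and supplied by the server over the Internet to the clients running the screensaver. Each client computer automatically analyzes one chunk of data (an analysis far more detailed than the processing Serendip normally performs) and then reports the most promising candidate signals to the server. At the same time as this report, a new chunk of data is sent to the client. With 50,000 to 100,000 participants, the computing power of this scheme would reach a sizable fraction of that of a typical supercomputer, and seti@home would become comparable to Serendip IV in the extent of parameter space searched.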

Introduction

Science, although almost totally supported by public funds, has traditionally been carried out in laboratories and observatories not open to the general public. In an era when the public's support of science is wavering, this modus operandi may be self-defeating and requires re-examination. The goal of the present SETI project, tentatively named seti@home, is (a) to do good science, and (b) to do it in a way that engages and excites the general public. This is a chance to educate participants about how science works, as well as to give them reliable information about SETI (as opposed to, for example, the film Independence Day). In the end the scientific community can only profit if the public better understands the scientific enterprise.

Once operational, seti@home will:

Project Architecture and Data Flow

Seti@home is a "piggyback" survey based on the Serendip IV survey, which itself is a piggyback survey operating on the 305-m Arecibo telescope. Serendip IV, which will begin operations on the newly upgraded Arecibo dish in 1997, is described (along with its predecessor) by Bowyer et al. (1997) elsewhere in this volume. The basic idea is that a separate Serendip receiver and data processor "rides around" on the Arecibo feed platform as normal radio astronomy is carried out. The sky visible to Arecibo is thus surveyed in a pseudo-random fashion, and in fact any given patch of sky is typically revisited every 3-6 months. These revisits are critical for discriminating against manmade radio-frequency interference (RFI).

The overall architecture of seti@home is presented in Figure 1. At Arecibo seti@home will tape-record at baseband a small portion of Serendip's total bandwidth of 183 MHz; 2 MHz will cover the 3 possible velocity rest frames of heliocentric, galactocentric, and cosmic background radiation. This band will have been down-converted from an observing frequency centered on the 21 cm hydrogen line at 1420 MHz, ideal in terms of SETI strategy and freedom from manmade interference (RFI). With one-bit sampling at the Nyquist rate, this is a data-recording rate of 0.5 MB/sec, or one Exabyte Mammoth DAT (25 GB capacity) every 11 hours, which is 500 per year (for an expected 70% observing efficiency). These tapes are Fed-Ex'ed to the Big Science server where they are validated, archived, and supplied with indexing parameters, in particular sky positions and times. Only a portion of the data is ever analyzed, in the first instance whatever is necessary to keep customers supplied (see below). This portion of the data will be chosen so as to be at the lowest possible galactic latitudes, i.e., in the direction of the galactic plane as in the "Milky Way strategy" put forward by Sullivan and Mighell (1984).
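The recording rate quoted above follows directly from the sampling parameters. The short sketch below reproduces that arithmetic, assuming real-valued Nyquist sampling (two samples per hertz of bandwidth) and decimal megabytes; the function name is illustrative only and does not come from the project.

```python
def recording_rate_bytes_per_s(bandwidth_hz: float, bits_per_sample: int = 1) -> float:
    """Bytes per second produced by Nyquist-rate sampling of a band."""
    samples_per_s = 2 * bandwidth_hz          # Nyquist rate for a real-valued signal
    return samples_per_s * bits_per_sample / 8


rate = recording_rate_bytes_per_s(2e6)        # 2 MHz recorded band, one-bit samples
print(f"{rate / 1e6:.2f} MB/s")               # prints "0.50 MB/s"
```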

Figure 1. A schematic diagram of the architecture and data flow of seti@home.
Note that "N" refers to the number of computers currently participating in the analysis.

Even if no signal appears on the first look at a given sky position, sky positions that are (by chance) repeated are given the highest priority for analysis because they allow one to combat RFI and to treat the possibility of interstellar scintillations. Candidate signals are assigned higher IQ's (Interest Quotients) according to how many times they repeat and how distinct they look from possible RFI in terms of their fitting the beam pattern and exhibiting a non-local Doppler drift.

Each customer will work at any time on 0.25 MB of data, which represents 50 seconds of a 20 kHz signal. These 20 kHz bandwidths will be created from the original 2 MHz data at the server by lookup-table FFT's. This 50 sec of drift data corresponds to a sky swath of size 6' x 25' (4 independent beams, essential for RFI discrimination). This means that each person has a 50 sec x 20,000 Hz array as his/her basic data set; code and data storage requirements to analyze this chunk are estimated at 3-4 MB. On the receiving end a 0.25 MB chunk will require 1.3 sec on an incoming T1 line of 190 kB/s, or 2.3 minutes on a 14.4K baud line (sufficiently short not to be discouraging to customers on phone lines). Upon completion (typically after several days) of the data analysis for each chunk, a short message reporting candidate signals is presented to the customer and also returned to Big Science and to the University of Washington for post-processing (see below).
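The chunk size and download times in this paragraph can be checked with a few lines of arithmetic. This is a minimal sketch, assuming one-bit Nyquist-rate samples, decimal units, a modem that delivers 14,400 bits per second, and no protocol overhead; the helper names are hypothetical.

```python
CHUNK_SECONDS = 50
SUBBAND_HZ = 20_000


def chunk_bytes(bandwidth_hz: int, seconds: int, bits_per_sample: int = 1) -> float:
    """Size of one work unit: Nyquist-rate samples over the chunk duration."""
    return 2 * bandwidth_hz * seconds * bits_per_sample / 8


def transfer_seconds(n_bytes: float, link_bytes_per_s: float) -> float:
    """Ideal transfer time, ignoring protocol overhead and latency."""
    return n_bytes / link_bytes_per_s


size = chunk_bytes(SUBBAND_HZ, CHUNK_SECONDS)      # 250,000 bytes = 0.25 MB
t1 = transfer_seconds(size, 190_000)               # T1 line at 190 kB/s
modem = transfer_seconds(size, 14_400 / 8)         # 14.4 kbit/s phone line

print(f"chunk: {size / 1e6:.2f} MB, T1: {t1:.1f} s, modem: {modem / 60:.1f} min")
```

Under these assumptions the sketch gives 0.25 MB, 1.3 s, and 2.3 minutes, matching the figures in the text.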

On the server end, a T1 line can handle 90 GB per week and thus 360,000 customers could be serviced weekly (or 180,000 twice per week) with one outgoing line. This 90 GB represents 30% of a fully-recorded week at Arecibo, or 43% of a more realistic week (at 70% efficiency).
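The server-side percentages can be reproduced in the same way. The sketch below takes the 0.25 MB chunk size, the 0.5 MB/sec recording rate, and the 360,000 weekly customers from the text as given; the variable names are illustrative.

```python
CHUNK_BYTES = 0.25e6                 # one work unit
RECORD_RATE = 0.5e6                  # bytes per second written to tape at Arecibo
WEEK_SECONDS = 7 * 24 * 3600
OBSERVING_EFFICIENCY = 0.70

weekly_out = 360_000 * CHUNK_BYTES                       # bytes served per week
weekly_recorded_full = RECORD_RATE * WEEK_SECONDS        # round-the-clock recording
weekly_recorded_real = weekly_recorded_full * OBSERVING_EFFICIENCY

print(f"served per week: {weekly_out / 1e9:.0f} GB")
print(f"share of a fully recorded week: {weekly_out / weekly_recorded_full:.0%}")
print(f"share at 70% efficiency: {weekly_out / weekly_recorded_real:.0%}")
```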

Signal Analysis @home

The uniqueness and advantage of seti@home from a SETI point of view is the depth of signal analysis that will be possible on each chunk of bandwidth x time (and sky position). (N.B.: The customer will not have any control over the nature of the search; the home computer's CPU cycles are simply being rented by an automatic program.) It is proposed to look for signals of widths ranging from 0.1 Hz (a 2 x 10^5 point FFT) to 2000 Hz. (Serendip now searches for lines of width 0.6 to 640 Hz; the partial redundancy between the two searches will be a useful cross-check.) All signals will be sought for each of 40 assumed distinct Doppler drift rates (during the 12 sec pass time through one beam); anything that has zero drift with respect to Puerto Rico will automatically be rejected as RFI, but all nonzero drift rates will be retained, especially those matching the earth's rotational and orbital acceleration (of order 0.5-1.5 Hz over 12 sec; the actual value at the time of observation will be calculated by the server and tagged on the data chunk). These data also allow searching for intermittent or pulselike signals with durations anywhere from 0.5 msec to 10 sec - this is one of the main advantages of seti@home over Serendip, which can only detect pulses with a spacing greater than 1.7 sec. Any signals lasting for less than a beamwidth might be burst-like RFI, but will be retained (albeit with lower weight). The most viable candidates, however, must also match the beam-shape, for this is the best criterion for distinguishing a real point-source signal from RFI. Thus a Gaussian beam correlation filter (with a width of 12 sec) will also be applied to all candidate signals. In summary, candidate signals will be reported back to the server on the basis of their exhibited signal/noise ratio for the best-fit bandwidth, timewidth, frequency channel, frequency drift, and beam shape matching.
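To make two of these steps concrete, the sketch below first computes the transform length needed for a given frequency resolution, then slides a Gaussian template over a power-versus-time series to find where the series best matches the beam shape. It is an illustration under stated assumptions, not the project's actual filter: complex sampling at one sample per hertz of bandwidth is assumed, the 12 sec width is interpreted as a full width at half maximum, a Pearson-style normalized correlation stands in for the beam correlation filter, and all function names are hypothetical.

```python
import math


def fft_points(bandwidth_hz: float, resolution_hz: float) -> int:
    """Transform length for a given resolution (assumes complex sampling)."""
    return round(bandwidth_hz / resolution_hz)


def gaussian_beam(n: int, dt: float, fwhm_s: float) -> list[float]:
    """Gaussian template centered in an n-sample window."""
    sigma = fwhm_s / (2 * math.sqrt(2 * math.log(2)))
    mid = (n - 1) / 2
    return [math.exp(-0.5 * ((i - mid) * dt / sigma) ** 2) for i in range(n)]


def best_beam_match(power: list[float], dt: float,
                    fwhm_s: float = 12.0) -> tuple[float, float]:
    """Return (time of best match in seconds, normalized correlation)."""
    n = int(round(2 * fwhm_s / dt)) | 1            # odd window about 2 FWHM wide
    tmpl = gaussian_beam(n, dt, fwhm_s)
    t_mean = sum(tmpl) / n
    t_dev = [v - t_mean for v in tmpl]
    t_norm = math.sqrt(sum(v * v for v in t_dev))
    best_t, best_r = 0.0, -1.0
    for start in range(len(power) - n + 1):
        window = power[start:start + n]
        w_mean = sum(window) / n
        w_dev = [v - w_mean for v in window]
        w_norm = math.sqrt(sum(v * v for v in w_dev))
        if w_norm == 0.0:
            continue                               # flat window carries no shape
        r = sum(a * b for a, b in zip(t_dev, w_dev)) / (t_norm * w_norm)
        if r > best_r:
            best_t, best_r = (start + (n - 1) / 2) * dt, r
    return best_t, best_r


# Synthetic check: a point source crossing the beam center at t = 25 s.
dt = 0.5
sigma = 12.0 / (2 * math.sqrt(2 * math.log(2)))
series = [1.0 + 5.0 * math.exp(-0.5 * ((i * dt - 25.0) / sigma) ** 2)
          for i in range(100)]
t_best, r_best = best_beam_match(series, dt)
print(fft_points(20e3, 0.1), t_best, round(r_best, 3))
```

A signal that switches on and off abruptly, as burst-like RFI might, would score a lower correlation against the same template, which is the property the beam-shape criterion relies on.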

Post-Processing

Post-processing at the University of Washington will build up a database of all candidate signals reported back from the customer computers. This database will be used to monitor progress with the survey, monitor data quality and the RFI environment at Arecibo, and decide if any changes in survey or operational procedures may be desirable. With the extremely high number of cases that we will be examining, we expect that RFI will often by chance behave in such a manner that it mimics an extraterrestrial signal. Thus it is extremely important that any high-signal/noise candidates observed once are only considered highly tentative until positively confirmed by other (random) passes (usually 3-6 months later) of the beam over the same patch of sky. The post-processing will monitor for these subsequent passes and compare the consistency of the characteristics of the candidate signals for each pass. In this way a set of best candidate signals will be accumulated, to be checked later with follow-up, dedicated Arecibo time.
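The confirmation step described above amounts to cross-matching candidates from different passes over the same patch of sky. The following sketch shows one way such a check could look. The record fields, the tolerances, and the 30-day minimum gap are all assumptions chosen for illustration; frequencies are assumed to have been corrected to a common rest frame beforehand, and right-ascension wraparound at 0/360 degrees is ignored for brevity.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    ra_deg: float     # right ascension of the beam center
    dec_deg: float    # declination of the beam center
    freq_hz: float    # frequency in a common rest frame (assumed corrected upstream)
    day: float        # observation date, in days since the start of the survey
    snr: float        # reported signal/noise ratio


def same_source(a: Candidate, b: Candidate, pos_tol_deg: float = 0.05,
                freq_tol_hz: float = 50.0, min_gap_days: float = 30.0) -> bool:
    """True if two detections agree in position and frequency but come
    from passes well separated in time."""
    mean_dec = math.radians((a.dec_deg + b.dec_deg) / 2)
    d_ra = (a.ra_deg - b.ra_deg) * math.cos(mean_dec)   # small-angle approximation
    d_dec = a.dec_deg - b.dec_deg
    close_on_sky = math.hypot(d_ra, d_dec) <= pos_tol_deg
    close_in_freq = abs(a.freq_hz - b.freq_hz) <= freq_tol_hz
    separate_pass = abs(a.day - b.day) >= min_gap_days
    return close_on_sky and close_in_freq and separate_pass


def confirmed(cands: list[Candidate]) -> list[Candidate]:
    """Candidates with at least one matching detection from another pass."""
    return [c for c in cands
            if any(same_source(c, o) for o in cands if o is not c)]


first = Candidate(10.00, 20.00, 1_420_000_000.0, 0.0, 12.0)
repeat = Candidate(10.01, 20.01, 1_420_000_010.0, 120.0, 9.5)
single = Candidate(50.00, 5.00, 1_420_300_000.0, 10.0, 15.0)
print(confirmed([first, repeat, single]))
```

In this toy data set only the first two detections survive, because the third has no counterpart from a later pass.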

Post-processing will also search for characteristic regularities of the reported signals as members of a group. One example is frequency multiplets, namely pairs or n-tuplets of arbitrarily but equally spaced narrowband signals. For the advantages of such signals for SETI, see Cohen (1994) and Cordes and Sullivan (1994).
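As an illustration of a multiplet search, the sketch below scans a list of candidate frequencies for triplets whose two spacings agree within a tolerance. Triplets are used because they are the smallest group in which equal spacing can actually be tested. The brute-force approach, the tolerance, and the function name are assumptions for this example; a real search over a large candidate database would need something more efficient.

```python
from itertools import combinations


def find_triplets(freqs_hz: list[float], tol_hz: float = 0.5) -> list[tuple[float, float, float]]:
    """Return ascending (f1, f2, f3) tuples whose spacings agree within tol_hz."""
    found = []
    for f1, f2, f3 in combinations(sorted(freqs_hz), 3):
        if abs((f2 - f1) - (f3 - f2)) <= tol_hz:
            found.append((f1, f2, f3))
    return found


# Frequency offsets in Hz within one sub-band; three of them are 250 Hz apart.
offsets = [1500.0, 1000.0, 3100.0, 1250.0, 2140.0]
triplets = find_triplets(offsets)
print(triplets)
```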

It should not be forgotten that there exists the tantalizing possibility of serendipitously discovering, as a byproduct of this SETI analysis, a wholly new type of naturally occurring astrophysical phenomenon. The overall search analysis is the same except that natural signals could be broader than a single beam and are therefore that much trickier to distinguish from RFI. For candidates smaller than a beam, however, the distinction between an intelligent and a natural origin can only come later after follow-up observations and analysis.

The Screensaver Itself

The program that runs on each client computer looks and behaves like a captivating screensaver. It runs only when the machine is idle, and the user can choose from several different colorful and dynamic "visualizations" of the SETI process. Some of these visualizations will look technical, some will look abstract, and some will look decidedly artistic. We will provide a core set of visualizations, as well as a plug-in mechanism so that others can easily be added. Standard screensaver modes will include (1) a map of the world showing the location of all machines currently participating in the project, (2) a map of the sky showing what areas have been covered by the survey and the location of the patch of sky currently being analyzed (with the option of viewing classical mythological figures for the constellations - one might, for instance, see that one's patch is in the armpit of Orion!), (3) colorful, changing patterns that correspond to the Fourier transforms currently being undertaken, and (4) "straight" graphs showing results of the currently evolving data analysis.

Closing

As of September 1996 seti@home is only a proposal, but we feel certain that it will happen soon. The potential customers for seti@home include astronomy and space enthusiasts (e.g., members of The Planetary Society), science fiction fans (e.g., "Trekkies"), Internet adventurers ("netheads"), science teachers and their students, and other science & technology enthusiasts. All we need to make this global participatory project a reality is sponsorship by a visionary, high-tech corporation. The project is good science and it will generate great publicity and goodwill for any sponsor, as well as for science in general and bioastronomy in particular. Any takers?


REFERENCES

Copyright © 2010 University of California, Translated by JE2BWM