Inventors listed include AT&T Labs Research employees only. For a full list of inventors, click on patent number.

Broadband switches and routers must accommodate the diverse traffic parameters and quality-of-service requirements of voice, data, and video flows. End-to-end performance guarantees depend on flows complying with traffic contracts, such as an average transmission rate and a maximum burst size, at the edge of the network. Traffic shaping devices are used to enforce these contracts for a set of flows entering or leaving the network on a particular link, ensuring that only conforming packets can enter the network. These devices must operate at very high speeds, while arbitrating link access for thousands or even tens of thousands of flows with different traffic contracts. Shapers handling a large number of active flows often have transient backlogs of packets that are eligible for transmission. In contrast to previous shaper designs, the invention arbitrates fairly between flows with conforming packets by carefully integrating traffic shaping with rate-based scheduling algorithms. Through a careful combination of per-flow queuing and approximate sorting, the shaper performs a small, bounded number of operations in response to each arrival and departure, independent of the number of flows and packets. This is crucial to operating on high-speed links, where the shaper may not have time to perform more than a handful of memory accesses in response to each incoming or outgoing packet. When the shaper must handle a wide range of bandwidth parameters, a hierarchical arbitration scheme can reduce the implementation overheads and further limit interference between competing flows. The invention limits shaping delay and traffic distortions, even in periods of heavy congestion. The efficient combination of traffic shaping and link scheduling results in an effective architecture for managing buffer and bandwidth resources in large, high-speed ATM switches.
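The shaping idea above can be sketched roughly as follows. This is a minimal illustration, not the patented design: it assumes a token bucket as the traffic contract (average rate plus maximum burst) and a calendar queue as the form of approximate sorting. The names TokenBucket, CalendarQueue, and bin_width are hypothetical.

```python
from collections import defaultdict, deque


class TokenBucket:
    """Per-flow contract: average rate (bytes/s) and maximum burst (bytes)."""

    def __init__(self, rate, burst):
        self.rate = rate
        self.burst = burst
        self.tokens = burst  # assumption: each flow starts with a full bucket
        self.last = 0.0

    def _refill(self, now):
        # Add tokens for the elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def eligible_time(self, now, size):
        """Earliest time at which a packet of `size` bytes conforms."""
        self._refill(now)
        if self.tokens >= size:
            return now
        return now + (size - self.tokens) / self.rate

    def consume(self, now, size):
        """Charge a transmitted packet against the bucket."""
        self._refill(now)
        self.tokens -= size


class CalendarQueue:
    """Approximate sorting: flows are binned by eligibility time, not fully ordered."""

    def __init__(self, bin_width):
        self.bin_width = bin_width
        self.bins = defaultdict(deque)

    def insert(self, flow_id, eligible_at):
        # One append per insertion, independent of the number of queued flows.
        self.bins[int(eligible_at // self.bin_width)].append(flow_id)

    def pop_bin(self, now):
        """Remove and return the flows in the bin that covers `now`."""
        return list(self.bins.pop(int(now // self.bin_width), ()))
```

Binning trades exact ordering for a fixed amount of work per arrival, which is the property the abstract emphasizes; flows within a bin are served in arrival order.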
A system and method are described for assigning downstream and upstream communication pathways in broadband access networks, such as networks based on the hybrid fiber-coax architecture. In the upstream direction, frequency channel and time slot assignments use a packing opposite methodology, which is dependent on the type of service request. The system assigns the lowest time slot in the lowest frequency available for a DS0 service request and the highest time slot in the highest frequency which has the next four contiguous time slots available for an H0 service request. To optimize upstream bandwidth efficiency, an H0 channel occupying the space of four contiguous time slots is used to carry up to six simultaneous voice conversations. A method is described that governs such use of H0 channels by NIUs according to their offered loads. In the downstream direction, network interface units are automatically assigned frequencies in response to the expected load across the available frequency channels and are made to spread the load evenly across all downstream frequency channels. In one embodiment, assignments are based upon each individual network interface unit load, each network interface unit group load, the capacity of each network interface unit group and the blocked load. An assignment will be made to the group having the maximum idle capacity, i.e., the group which has the greatest blocked load differential. That is, assignments are made if moving an individual NIU from a first NIU group to a second NIU group results in a blocked load differential greater than the cost ratio R, where R represents the cost of moving an NIU divided by the cost.
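The upstream "packing opposite" assignment can be sketched as below. This is an illustrative reading of the abstract, not the patented method: the grid is assumed to be a list of frequencies, each a list of booleans (True meaning busy), and an H0 request is assumed to take the highest-placed block of four free contiguous slots in the highest frequency that has one. The function names are hypothetical.

```python
def assign_ds0(grid):
    """Take the lowest free time slot in the lowest frequency that has one."""
    for freq, slots in enumerate(grid):
        for slot, busy in enumerate(slots):
            if not busy:
                slots[slot] = True
                return freq, slot
    return None  # request is blocked


def assign_h0(grid, width=4):
    """Take the highest block of `width` free contiguous slots, highest frequency first."""
    for freq in range(len(grid) - 1, -1, -1):
        slots = grid[freq]
        for start in range(len(slots) - width, -1, -1):
            if not any(slots[start:start + width]):
                slots[start:start + width] = [True] * width
                return freq, start
    return None  # request is blocked
```

Filling DS0 requests from the low end and H0 requests from the high end keeps contiguous free runs available for H0 for as long as possible.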
Many email messages are part of conversational transactions and require a response. In many cases, the set of possible responses can be anticipated by the sender (e.g., Yes or No, Pizza or Pasta, etc.). In these cases, structuring the email transaction provides benefits to both the sender and the recipients. Senders get the benefit of tabulated responses, and recipients get a more convenient way to respond. Our system provides a web-based composition tool for senders, a central store of messages and responses, and pointer messages that lead recipients to a virtual mailbox on the web that features form-based responses, pre-structured by the sender. The message a recipient reads may vary based on preceding responses; by storing the messages centrally, recipients always see the current message (e.g., "6 out of 8 can attend so far").
Strudel is a system for specifying and generating data-intensive Web sites that separates the tasks of accessing and integrating a site's data sources, building its structure, and generating its HTML representation. For the first two tasks, Strudel uses a novel declarative query language called StruQL, which extends SQL with the ability to construct new, richly structured graphs. Strudel helps designers of data-intensive Web sites build modular, reusable site-definition code and supports better site engineering. Strudel was applied to a production Web site inside AT&T, the High Toll Notifier Web site, and resulted in smaller, more reusable, analyzable, and optimizable code.
The invention is a system and method for database compression which creates partial indexing into compressed sub-table blocks of databases. Table rows with the same or related indexing parameters are grouped as sub-table blocks and are stored as compressed binary objects, with the indexing fields stored in the same row, external to the binary block. The binary object expands to multiple database rows when accessed via the sub-table block interface, thus forming a hierarchical, pre-joined database organization. Mechanisms are provided for creating, accessing, and manipulating the data blocks, along with a date-based versioning mechanism. The compression employed is the known Vdelta package, which operates at a byte level to provide a useful compromise between speed and compression efficiency, even for relatively short compression blocks. In realistic tests, the I/O time gained through compression results in a time saving which exceeds the processing penalty. The overall compression ratio is data dependent, but in a realistic test it averages about 4.
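A rough sketch of the sub-table block idea follows. It is an assumption-laden illustration: Python's zlib stands in for the Vdelta package named in the abstract, rows are plain dictionaries serialized as JSON, and pack_blocks and read_block are hypothetical names. The index value stays outside the compressed blob, so a block can be located without decompressing anything.

```python
import json
import zlib
from collections import defaultdict


def pack_blocks(rows, key_field):
    """Group rows by their index field and compress each group into one blob."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key_field]].append(row)
    # The key is stored uncompressed next to the blob (zlib is a stand-in for Vdelta).
    return {
        key: zlib.compress(json.dumps(group).encode("utf-8"))
        for key, group in groups.items()
    }


def read_block(blocks, key):
    """Expand one compressed blob back into its multiple rows."""
    blob = blocks.get(key)
    if blob is None:
        return []
    return json.loads(zlib.decompress(blob).decode("utf-8"))
```

Only the block whose key matches a query is decompressed, which is where the I/O saving described above would come from.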
An enduser at a POTS analog voice-only endpoint (136) and endusers at H.320 standard multimedia terminals (101, 102, 103, 104), which each communicate over separate voice, video and data streams, engage in a videoconference with each other in a pseudo multimedia manner through a central platform (135) that provides call conversion capabilities. A document to be shared by a user at the POTS endpoint with users at the multimedia endpoints is transmitted as a data signal from a facsimile machine (137) or PC terminal (138) associated with the POTS user to a server (146) in the platform. The received data signal is then inputted to a multimedia bridge (124) and transmitted on the data stream to each multimedia endpoint for display on a window on each multimedia terminal. Similarly, a document to be shared by a multimedia endpoint is transmitted on a data stream to the multimedia bridge, where it is bridged on the data stream transmitted to the other multimedia endpoints and to the server. The document is then transmitted from the server to the facsimile machine or PC terminal associated with the POTS endpoint. In conventional multimedia conferencing arrangements, voice-activated switching is used to determine which user's video image is bridged onto the video stream transmitted to each multimedia terminal. When the audio signal from the POTS user would cause a video signal from that user's terminal to be bridged to all the multimedia endpoints if in fact that user was at a multimedia terminal, a stored image of that user is retrieved from a database (151) and outputted by the bridge on the video stream transmitted to each multimedia terminal to enable the multimedia participants to visually identify the presently talking enduser.
A laptop has an integrated telephone, in which the telephone and mouse unit are arranged in a manner that allows the mouse to be adjusted to either side of the laptop computer to accommodate either a left-handed or right-handed person. A telephone is tightly integrated into the body of a laptop PC creating a much more natural and ergonomic physical interface between the phone and the computer. The resulting device includes a mounting for the mouse module, allowing ease of use by both right-handed and left-handed users. The laptop computer includes a recessed storage area for the small telephone handset. The storage area is located in the area just below the keyboard, i.e., the area where the user's wrists usually lie. The telephone handset and associated cable reside in the tray. The mouse module slides along a guide way in the tray, and can be positioned on either side of the guide tray/telephone handset storage tray. The benefits of the above tight integration of the laptop PC and the telephone are many. First, one's laptop becomes one's telephone console. Second, the resulting telephone has local user programmable processing and a large storage area. Furthermore, the telephone has a large high resolution display. Moreover, integrating an IP telephone into the laptop allows communication over the same network IP links that the laptop communicates over normally. Finally, this does not preclude the inclusion of a standard PSTN telephone in the laptop.
A new architecture capable of utilizing the existing twisted pair interface between the customer services equipment and the local office is used to provide a vast array of new services to customers. Using an intelligent services director (ISD) at the customer services equipment and a facilities management platform (FMP) at the local office, new services such as simultaneous, multiple calls (voice analog or digital), facsimile, Internet traffic and other data can be transmitted over the existing single twisted pair using xDSL transmission schemes. New services such as the implementation of Internet connectivity, videophone, utility metering, broadcasting, multicasting, bill viewing, information pushing in response to a user profile, directory look-up and other services can be implemented via a network server platform via this architecture. A network server platform for hosting a plurality of services comprises, for example, a memory for storing a user profile, the user profile containing interests of a user, and for storing information related to their interests and a controller for controlling the collection of information from information servers and for pushing the collected information to the user in accordance with their defined priority.
A computer system in which resources are selected or purged based on extremes of utilization (i.e., by virtue of having either the highest or lowest utilization ranking in a group of resources) effectively handles newly added resources whose initial utilization rankings do not reflect their true popularity (or lack thereof). New resources are arbitrarily assigned an initial utilization ranking that is in the middle of the range of utilization rankings--e.g., preferably the median or, more preferably, the mean--of other members of the group. Therefore, the new resource is not immediately either selected or purged, because its initial utilization ranking is not at either extreme, and eventually, after not too long an interval, the utilization ranking reflects the true conditions, and the resource will be selected or purged, or not, as the statistics may warrant. In the meantime, other resources in the group are selected or purged based on their actual statistics which reflect the true conditions closely enough.
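The initial-ranking rule can be illustrated with a short sketch. The function names and the dictionary of rankings are hypothetical, and the fallback value for an empty group is an assumption not stated in the abstract.

```python
from statistics import mean, median


def initial_rank(existing_ranks, use_mean=True):
    """Place a new resource in the middle of the current rankings."""
    if not existing_ranks:
        return 0.0  # assumption: no peers to compare against
    return mean(existing_ranks) if use_mean else median(existing_ranks)


def pick_to_purge(ranks):
    """Purge the resource with the lowest utilization ranking."""
    return min(ranks, key=ranks.get)
```

For example, with rankings {"a": 10, "b": 2, "c": 6}, a new resource enters at 6 and "b" remains the purge candidate, so the newcomer is not evicted before it has accumulated real statistics.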
A method and apparatus are provided for compliance checking in a trust-management system. A request r, a policy assertion (f_0, POLICY), and n-1 credential assertions (f_1, s_1), ..., (f_{n-1}, s_{n-1}) are received, each credential assertion comprising a credential function f_i and a credential source s_i. Each assertion may be monotonic, authentic, and locally bounded. An acceptance record set S is initialized to {(Λ, Λ, R)}, where Λ represents a distinguished null string, and R represents the request r. Each assertion (f_i, s_i), where i represents the integers from n-1 to 0, is run and the result is added to the acceptance record set S. This is repeated mn times, where m represents a number greater than 1, and an acceptance is output if any of the results in the acceptance record set S comprise an acceptance record (0, POLICY, R).
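The loop described above might be sketched as follows. This is a simplified illustration under stated assumptions: each assertion function takes the current set of acceptance records and returns the set of action strings it accepts, None stands in for the null string, and m defaults to 2. The name compliance_check and the shape of the assertion functions are hypothetical.

```python
def compliance_check(request, assertions, m=2):
    """assertions[0] is the policy (f_0, "POLICY"); the rest are credentials (f_i, s_i)."""
    n = len(assertions)
    # Initial record set: the null-string record carrying the request.
    records = {(None, None, request)}
    for _ in range(m * n):
        # Run the assertions in the order i = n-1 down to 0.
        for i in range(n - 1, -1, -1):
            func, source = assertions[i]
            for action in func(frozenset(records)):
                records.add((i, source, action))
    # Accept only if the policy itself produced a record for the request.
    return (0, "POLICY", request) in records
```

Because the record set only grows, repeated passes let the policy see acceptances that credentials produced in earlier passes.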
In a method for reducing latency in packet telephony caused by anti-jitter buffering, audio data elements are received and placed in a telephony input buffer used for anti-jitter buffering. Rather than wait until the buffer is full, the audio data elements are clocked, or played, out of the buffer at a rate slower than the normal play rate. In this way, latency due to the initial buffer fill period is reduced or eliminated. Audio data elements continue to be played out at a slower than normal rate until the buffer fill level reaches a threshold. At that time, the play rate for sending data elements out of the telephony input buffer is adjusted to the normal play rate. In an alternative embodiment of the present invention, the fill level of the telephony input buffer is controlled within a desired range by speeding up or slowing down the rate at which audio data elements are played out of the telephony input buffer. In yet another alternative embodiment, the amount of latency jitter in the packet network is measured and the size of the telephony input buffer is adjusted based upon the relative amount of jitter, such that the relative size of the buffer is reduced when the packet network is quiet, and the size of the buffer is increased when the network is relatively jittery.
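Two of the embodiments above reduce to small decision rules, sketched here. The parameter names and default values (a slow factor of 0.8, a jitter multiplier of 2, bounds of 20 and 200 ms) are illustrative assumptions, not figures from the abstract.

```python
def play_rate(fill_level, threshold, normal_rate=1.0, slow_factor=0.8):
    """Play slower than normal until the buffer fill level reaches the threshold."""
    if fill_level >= threshold:
        return normal_rate
    return normal_rate * slow_factor


def target_buffer_ms(jitter_ms, multiplier=2, min_ms=20, max_ms=200):
    """Size the buffer from measured jitter: smaller when quiet, larger when jittery."""
    return max(min_ms, min(max_ms, multiplier * jitter_ms))
```

Playout starts as soon as the first element arrives, so the listener does not wait for the buffer to fill; the buffer fills gradually because elements arrive faster than they are played.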
A messaging system in which a core messaging infrastructure stores and manages messaging attributes, but applications external to the core infrastructure define and modify most attributes. Attribute types may be easily defined or modified, the manner in which attribute values are obtained may be easily defined or modified, and the entity types to which attributes are assigned may be easily defined or modified. The messaging system includes a plurality of messaging entities, such as messages, folders, and users, a plurality of attributes associated with the messaging entities, and a plurality of applications. Each application is operable to examine and modify at least some of the messaging entities and attributes. An application selection device is operable to examine at least some of the messaging entities and at least some of the attributes and to select an application to be invoked, from among the plurality of applications, based on values of the examined messaging entities and attributes. An application invocation device invokes the selected application. The applications may define and modify a type of an attribute and/or may define and modify an association of an attribute with a messaging entity.
In an arrangement where users are connected to an ISP through a bank of modems, a time-out threshold is selected for each user based on the user's connection pattern. The threshold is varied dynamically in response to access patterns, in an attempt to trade the benefit accrued by using the ISP's modem and phone line for a shorter period of time, against the inconvenience to the user from having to reestablish a connection to the ISP. Specifically, the time interval between the last disconnection by the user and the time of reconnection is evaluated, and when this time interval is shorter than a preselected threshold, the time-out threshold is increased. When this time interval is longer than the preselected threshold, the time-out threshold is decreased. Typically, when the time-out threshold is decreased, it is decreased by a significantly smaller amount than the amount by which it is increased.
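The adjustment rule can be written as a short function. The step sizes (increase by 60 seconds, decrease by 5) and the lower bound are illustrative assumptions; the abstract only states that decreases are significantly smaller than increases.

```python
def update_timeout(timeout, reconnect_gap, gap_threshold,
                   increase=60.0, decrease=5.0, floor=30.0):
    """Adjust a user's idle time-out from the gap between disconnect and reconnect."""
    if reconnect_gap < gap_threshold:
        # The user came back quickly: disconnecting was premature, so wait longer.
        return timeout + increase
    if reconnect_gap > gap_threshold:
        # The user stayed away: the line could have been freed sooner.
        return max(floor, timeout - decrease)
    return timeout
```

The asymmetry means one premature disconnection outweighs many uneventful ones, which biases the system toward user convenience.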
A method and apparatus are described for inserting a watermark in the compressed domain. The watermark inserted does not require a reference. An overall watermarking system incorporating the invention combines cleartext, bitstream, and integrated watermarking. In a perceptual coder, the data enters a filterbank, where it is processed into multiple separate coefficients. A rate/distortion control module uses noise threshold information from a perceptual coder, together with bit-count information from a noiseless coder, to compute scale factors. The coefficients are multiplied by the scale factors and quantized, then noiseless coded and then output for further processing/transmission. The invention supports three embodiments for inserting a mark into the bitstream imperceptibly. It is assumed that some set of scale factor bands have been selected, into which mark data will be inserted. In one embodiment, a set of multipliers {x_i = 2^(N_i) : i ∈ M} is chosen. Each triple is modified by dividing the scale factor by x_i, multiplying the quantized coefficients by x_i, and adding mark data to the non-zero modified quantized coefficients. In an alternate embodiment, watermark data is represented via two characteristics of the bitstream data. A Huffman table is selected for encoding the Scale Factor Band receiving watermark data which is not the table that would normally be used. The watermark data bit is set according to any desired scheme, and the quantized coefficients are derived using the alternate Huffman table. In another embodiment, watermarking is integrated with quantization. The watermark is therefore difficult to remove without perceptible effects. The fact that marking data is present is again indicated by characteristics of the bitstream data. The modification factors {x_i} are now all close to unity.
A system, apparatus and method automatically update address information of a user's outgoing and incoming messages to/from a communication network, thereby relieving the user of the burden of manually entering address changes into a user address book. A plurality of users are coupled through terminals to a server in the communication network for telephone, CATV, Internet, intranet, messaging, facsimile, and similar purposes. The server includes a message store and a stored message profile, and is coupled to a change server linked to a network. The change server includes search rules and change options provided by the users in directing the change server in finding correct and alternative address information when erroneous or unknown information is detected in the outgoing and incoming messages. Each user address book includes a series of contacts for each user. Each contact is identified by an identification number, ID, including a name and address. The server detects message headers where a Send To Address is not in the address book. The change server is activated and accesses external databases for correct or alternative addresses in accordance with search rules provided by the user. The alternative or correct address information is installed in the user's address book and the Send To Message process is executed. For returned messages incorporating erroneous information, the search server is again activated to access the databases for correct address information, after which the user's address book is updated, thereby eliminating the time-consuming, irritating manual process of updating the user address book for outgoing and incoming messages.
A method and system are provided for performing recorded word concatenation to create a natural sounding sequence of words, numbers, phrases, sounds, etc. for example. The method and system may include a tonal pattern identification unit that identifies tonal patterns, such as pitch accents, phrase accents and boundary tones, for utterances in a particular domain, such as telephone numbers, credit card numbers, the spelling of words, etc.; a script designer that designs a script for recording a string of words, numbers, sounds etc., based on an appropriate rhythm and pitch range in order to obtain natural prosody for utterances in the particular domain and with minimum coarticulation between concatenative units; a script recorder that records a speaker's utterances of the domain strings; a recording editor that edits the recorded strings by marking the beginning and end of each word, number etc. in the string and including or inserting pauses according to the tonal patterns; and a concatenation unit that concatenates the edited recording into a smooth and natural sounding string of words, numbers, letters of the alphabet, etc., for audio output.
The invention concerns a method of generating morphemes for speech recognition and understanding. The method may include receiving training speech, selecting candidate sub-morphemes from the training speech, selecting salient sub-morphemes from the candidate sub-morphemes based on salience measurements, and clustering the salient sub-morphemes based on semantic and syntactic similarities into morphemes. The morphemes may be acoustic and/or non-acoustic. The sub-morphemes may represent any sub-unit of communication including phones, phone-phrases, grammars, diphones, words, gestures, tablet strokes, body movements, mouse clicks, etc. The training speech may be verbal, non-verbal, a combination of verbal and non-verbal, or multimodal.
A speech synthesis system can select recorded speech fragments, or acoustic units, from a very large database of acoustic units to produce artificial speech. The selected acoustic units are chosen to minimize a combination of target and concatenation costs for a given sentence. However, as concatenation costs, which are measures of the mismatch between sequential pairs of acoustic units, are expensive to compute, processing can be greatly reduced by pre-computing and caching the concatenation costs. Unfortunately, the number of possible sequential pairs of acoustic units makes such caching prohibitive. However, statistical experiments reveal that while about 85% of the acoustic units are typically used in common speech, less than 1% of the possible sequential pairs of acoustic units occur in practice. A method for constructing an efficient concatenation cost database is provided by synthesizing a large body of speech, identifying the acoustic unit sequential pairs generated and their respective concatenation costs, and storing those concatenation costs likely to occur. By constructing a concatenation cost database in this fashion, the processing power required at run-time is greatly reduced with negligible effect on speech quality.
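The cache-construction step can be sketched as below. It assumes the synthesized training speech is available as sequences of acoustic unit identifiers and that cost_fn is whatever routine computes a concatenation cost; both names are hypothetical, as is the fallback of computing a missing cost on demand.

```python
def build_cost_cache(unit_sequences, cost_fn):
    """Precompute costs only for sequential unit pairs that actually occur."""
    cache = {}
    for seq in unit_sequences:
        for left, right in zip(seq, seq[1:]):
            if (left, right) not in cache:
                cache[(left, right)] = cost_fn(left, right)
    return cache


def concat_cost(cache, left, right, cost_fn):
    """Look up a cached cost, computing it directly on a rare miss."""
    hit = cache.get((left, right))
    return cost_fn(left, right) if hit is None else hit
```

Because less than 1% of possible pairs occur in practice, the cache stays small while covering most run-time lookups.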
The invention provides a system and method for automatically indexing and retrieving multimedia content. The method may include separating a multimedia data stream into audio, visual and text components, segmenting the audio, visual and text components based on semantic differences, identifying at least one target speaker using the audio and visual components, identifying a topic of the multimedia event using the segmented text and topic category models, generating a summary of the multimedia event based on the audio, visual and text components, the identified topic and the identified target speaker, and generating a multimedia description of the multimedia event based on the identified target speaker, the identified topic, and the generated summary.
A premises, connected to receive broadband service(s) and also connected to a cable system, is provided with a broadband interface which connects to in-premises cabling which is coupled to consumer receivers such as television sets, PDAs, and laptops. Connected to the broadband interface is an adjunct device which channels broadband, data and voice signals supplied to an in-premises wireless system as distinguished from the signals supplied to the cable connected consumer receivers. The adjunct device formats the broadband and voice signals or any broadband service into packet format suitable for signal radiation and couples them to the in-premises coax cabling, via a diplexer, at a first selected location. At a second cable location a second diplexer, connected to the cable, separates the broadband, data and voice signals and couples them to a signal radiation device (i.e., an RF antenna or leaky coaxial cable) which radiates the signal to the immediate surrounding location. Various devices, near to the second cable location for specific services, receive the wireless signals (i.e., broadband, data and voice) from the radiating antenna.
This invention concerns a method and system for monitoring an automated dialog system for the automatic recognition of language understanding errors based on a user's input communications in a task classification system. The method may include determining whether the user's input communication can be understood in order to make a task classification decision. If the user's input communication cannot be understood and a task classification decision cannot be made, a probability of understanding the user's input communication may be determined. If the probability exceeds a first threshold, further dialog may be conducted with the user. Otherwise, the user may be directed to a human for assistance. In another possible embodiment, the method operates as above except that if the probability exceeds a second threshold, the second threshold being higher than the first, then further dialog may be conducted with the user using the current dialog strategy. However, if the probability falls between the first threshold and the second threshold, the dialog strategy may be adapted in order to improve the chances of conducting a successful dialog with the user. This process may be cumulative. In particular, the first dialog exchange may be stored in a database. Then, a second dialog exchange is conducted with the user. As a result, a second determination as to whether the user's input communication can be understood is made based on the stored first exchange and the current second exchange. This cumulative process may continue using a third and fourth exchange, if necessary.
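The threshold logic of both embodiments can be expressed as one small decision function. The function name, the action labels, and the choice to pass the second threshold as an optional argument are assumptions made for illustration.

```python
def next_action(understood, p_understand, first_threshold, second_threshold=None):
    """Choose the next step for one dialog exchange."""
    if understood:
        return "classify"
    if second_threshold is not None and p_understand > second_threshold:
        # High confidence: keep the current dialog strategy.
        return "continue_dialog"
    if p_understand > first_threshold:
        # Between the thresholds (two-threshold embodiment): adapt the strategy.
        return "adapt_dialog" if second_threshold is not None else "continue_dialog"
    return "human"
```

In the cumulative variant, p_understand would be recomputed after each exchange from the stored earlier exchanges plus the current one.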
A hardware-assisted system and method for computing a visibility ordering of a set of primitives and rendering the set of primitives are described, comprising the steps of and means for locating primitives potentially in a layer and removing occluded primitives from the layer. The hardware-assisted locating step further includes the steps of initializing hardware buffers, initializing a layer number, assigning the layer number to each primitive, extracting a subset of the primitives from the set of primitives assigned to the layer number, and storing the subset of primitives in a color buffer. The hardware-assisted removing step further includes the steps of reading the color buffer to locate all primitives of the layer number, traversing a pixel array of the subset of primitives to obtain primitive ids and depth complexities, testing depth complexity for each primitive using a stencil buffer, removing those primitives from the layer number if the depth complexity is greater than one, re-inserting the primitives with a depth complexity greater than one back into the set of primitives, rendering the primitives of the layer number, incrementing the layer number, determining if any primitives have been extracted from the set of primitives in the layer number, halting execution if no primitives have been removed from the layer number, and repeating all of the above steps.
A Web server maintains, for one or more resources, a respective list of clients who requested that resource. The server takes on the responsibility of notifying all of those clients when the resource in question changes, thereby letting them know that if the resource is again asked for by a user, an updated copy will have to be requested from the origin server. The server thereupon purges the client list, and then begins rebuilding it as subsequent requests come in for the resource in question. Invalidation messages are sent to selected victim clients on the client list, independent of whether the resource in question has changed, when the list meets a predetermined criterion, such as becoming too large. The victim clients may include clients who access the server less frequently than others, clients who have accessed the server in the more distant past than other clients, i.e., using a first-in-first-out methodology, or clients who have not subscribed to a service that keeps them from being victim clients. Review of a client list to determine whether it meets the selected criterion can be invoked every time a client gets added to a client list or on a scheduled basis.
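A minimal sketch of one per-resource client list follows. It assumes the criterion is a maximum list size, that the list is reviewed on every addition, and that the victim is the client whose last access is oldest; the class and method names are hypothetical.

```python
class InvalidationList:
    """Clients to notify when one resource changes."""

    def __init__(self, max_clients):
        self.max_clients = max_clients
        self.clients = {}  # client -> time of most recent request

    def record_request(self, client, now):
        """Add a client; return any victims that should be invalidated early."""
        self.clients[client] = now
        victims = []
        while len(self.clients) > self.max_clients:
            # Victim choice (assumption): the client with the oldest last access.
            victim = min(self.clients, key=self.clients.get)
            del self.clients[victim]
            victims.append(victim)
        return victims

    def resource_changed(self):
        """Return every client to notify, then purge the list."""
        notified = list(self.clients)
        self.clients.clear()
        return notified
```

A victim that receives an early invalidation simply refetches the resource on its next request, at which point it rejoins the list.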
A method for alleviating congestion problems in prior art networks delays provision of dial tone signals to terminals that are, likely, carrying out a re-dial attempt in excess of a preselected number of re-dial attempts. A determination that the terminal seeking to establish a connection is likely carrying out a re-dial attempt may be based on numerous factors, such as the time since the last time the terminal desired to establish a call, the duration of the last call, etc. The delay that is imposed is, advantageously, sensitive to the number of times the terminal has attempted a re-dial, and on other conditions, such as the cause of the failure to establish a connection, network congestion conditions, etc. In imposing the dial tone delay, identities of the terminals that are to receive a delayed dial tone are placed in a FIFO queue.
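The delay rule and FIFO queue can be sketched as follows. The linear delay formula, its constants, and the names dial_tone_delay and DialToneQueue are illustrative assumptions; the abstract says only that the delay is sensitive to the number of re-dial attempts and to other conditions.

```python
from collections import deque


def dial_tone_delay(redial_count, allowed_redials=3, step=2.0, cap=30.0):
    """Delay in seconds, growing with re-dial attempts beyond the allowed number."""
    excess = redial_count - allowed_redials
    return 0.0 if excess <= 0 else min(cap, step * excess)


class DialToneQueue:
    """FIFO of terminals waiting for a delayed dial tone."""

    def __init__(self):
        self.pending = deque()  # entries are (release_time, terminal)

    def request(self, terminal, redial_count, now):
        """Return True for an immediate dial tone, False if the terminal is queued."""
        delay = dial_tone_delay(redial_count)
        if delay == 0.0:
            return True
        self.pending.append((now + delay, terminal))
        return False

    def release_due(self, now):
        """Pop terminals from the head of the queue whose delay has elapsed."""
        released = []
        while self.pending and self.pending[0][0] <= now:
            released.append(self.pending.popleft()[1])
        return released
```

Terminals are released strictly in arrival order, so a terminal with a short delay can wait behind one with a longer delay.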